Pyramid vector quantizer shape search

ABSTRACT

An encoder and a method therein for Pyramid Vector Quantizer, PVQ, shape search, the PVQ taking a target vector x as input and deriving a vector y by iteratively adding unit pulses in an inner dimension search loop. The method comprises, before entering a next inner dimension search loop for unit pulse addition, determining, based on the maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent enloop_(y), in a lossless manner, in the upcoming inner dimension loop. The variable enloop_(y) is related to an accumulated energy of the vector y. Performing this method enables the encoder to keep the complexity of the search at a reasonable level.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of patent application Ser. No. 14/759,864, filed on Jul. 8, 2015 (published as US 2016-0027449), which is a National Stage Entry of International Patent application no. PCT/SE2015/050743, filed on Jun. 25, 2015, which claims priority to provisional patent application No. 62/029,586, filed on Jul. 28, 2014. The above identified applications and publication are incorporated by reference.

TECHNICAL FIELD

The disclosure herein generally relates to vector quantization (VQ) performed by an encoder.

BACKGROUND

It is known that unconstrained vector quantization is the optimal quantization method for grouped samples, i.e. vectors, of a certain length. However, implementation of unconstrained vector quantization implies high requirements in terms of complexity and memory capacity. A desire to enable implementation of vector quantization also in situations with memory and search complexity constraints has led to the development of so-called structured vector quantizers. Different structures give different trade-offs in terms of search complexity and memory requirements. One such method is the so-called gain-shape vector quantization, where the target vector t is represented using a shape vector x and a gain value G:

$\begin{matrix}{x = \frac{t}{G}} & \left( {{Eq}.\mspace{11mu} 0} \right)\end{matrix}$

The concept of gain-shape vector quantization is to quantize the pair {x, G} instead of directly quantizing the target vector t. The gain (G) and shape (x) components are encoded using a shape quantizer which is tuned for the normalized shape input, and a gain quantizer which handles the dynamics of the signal. This gain-shape structure is frequently used in audio coding since the division into dynamics and shape, also denoted fine structure, fits well with the perceptual auditory model. The gain-shape concept can also be applied to Discrete Cosine Transform coefficients or other coefficients used in video coding.

Many speech and audio codecs such as ITU-T G.718 and IETF Opus (RFC 6716) use a gain-shape VQ based on a structured PVQ in order to encode the spectral coefficients of the target speech/audio signal.

The PVQ-coding concept was introduced by R. Fischer in the time span 1983-1986 and has evolved to practical use since then with the advent of more efficient Digital Signal Processors, DSPs. The PVQ encoding concept involves searching for, locating and then encoding a point on an N-dimensional hyper-pyramid with the integer L1-norm of K unit pulses. The so-called L1-norm is the sum of the absolute values of the vector elements, i.e. the absolute sum of the signed integer PVQ vector is restricted to be exactly K, where a unit pulse is represented by an integer value of “1”. A signed integer is capable of representing negative integers, in contrast to an unsigned integer, which can only represent non-negative integers.
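As a small illustration of the L1-norm constraint described above, the following C sketch sums the absolute values of a signed integer vector and compares the result with K. The function names pvq_l1_norm and pvq_is_valid are hypothetical helpers introduced here for illustration only; they are not taken from any codec.

```c
#include <stdlib.h>

/* Hypothetical helper: L1 norm (sum of absolute values) of a signed
 * integer PVQ vector y of dimension n_dim. */
int pvq_l1_norm(const int *y, int n_dim)
{
    int sum = 0;
    for (int n = 0; n < n_dim; n++) {
        sum += abs(y[n]);
    }
    return sum;
}

/* Returns 1 when y holds exactly k unit pulses, i.e. when it lies on the
 * surface of the K'th pyramid, and 0 otherwise. */
int pvq_is_valid(const int *y, int n_dim, int k)
{
    return pvq_l1_norm(y, n_dim) == k;
}
```

For instance, y = {2, −1, 0, 3} has an L1-norm of 6, so it would be a valid point for N=4 and K=6 but not for K=5.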

One of the interesting benefits with the structured PVQ-coding approach in contrast to many other structured VQs is that there is no inherent limit in regard of a dimension N, so the search methods developed for PVQ-coding should be applicable to any dimension N and to any K value.

One issue with the structured PVQ-shape quantization is to find the best possible quantized vector using a reasonable amount of complexity. For higher rate speech and audio coding, when the number of allowed unit pulses K may become very high and the dimension N may also be high, there are even stronger demands on having an efficient PVQ-search, while maintaining the quality, e.g. in terms of Signal to Noise Ratio, SNR, of the reconstructed speech/audio.

Further, the use of the PVQ concept is not restricted to the speech and audio coding area. Currently, the so-called Internet Engineering Task Force, IETF, is pursuing a video codec development where Discrete Cosine Transform, DCT, coefficients are encoded using a PVQ-based algorithm. In video coding it is even more important than in audio coding to have an efficient search procedure, as the number of coefficients may become very large with large displays.

SUMMARY

For a structured PVQ it is desired to enable a computationally efficient shape search which still provides a high Signal to Noise Ratio, especially for implementations involving a fixed precision DSP. The solution provided herein enables a computationally efficient PVQ shape search, by providing an improved PVQ fine search.

According to a first aspect, a method is provided for PVQ shape search, to be performed by an encoder. The PVQ is assumed to involve taking a target vector x as input and deriving a vector y by iteratively adding unit pulses in an inner dimension search loop. The provided method comprises, before entering a next inner dimension search loop for unit pulse addition: determining, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), in the upcoming inner dimension loop. The variable enloop_(y) is related to an accumulated energy of y.

According to a second aspect, an encoder is provided, for PVQ shape search. The PVQ is assumed to involve taking a target vector x as input and deriving a vector y by iteratively adding unit pulses in an inner dimension search loop. The provided encoder is configured to, before entering a next inner dimension search loop for unit pulse addition: determine, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), in the upcoming inner dimension loop. The variable enloop_(y) is related to an accumulated energy of y.

The method may comprise and the encoder may be configured to, before entering a next inner dimension loop for unit pulse addition: determining, based on a maximum absolute value, xabs_(max), of the input vector, x, a possible upshift, in a bit word, of the next loop's accumulated in-loop correlation value, corr_(xy), between x and the vector y.

The method may comprise and the encoder may be configured to, when more than a current bit word length is needed to represent enloop_(y), perform the inner loop calculations using a longer bit word length to represent enloop_(y).

The method may comprise and the encoder may be configured to, when more than a current bit word length is needed to represent enloop_(y), perform the inner loop calculations using a longer bit word length to represent a squared accumulated in-loop correlation value, corr_(xy) ², between x and the vector y, in the inner loop.

The method may further comprise and the encoder may be configured to, when more than a current bit word length is not needed to represent enloop_(y):

-   -   performing the inner loop calculations by employing a first unit pulse addition loop using a first bit word length to represent enloop_(y), and:

when more than a current bit word length is needed to represent enloop_(y):

-   -   performing the inner loop calculations by employing a second unit pulse addition loop using a longer bit word length to represent enloop_(y) than the first unit pulse addition loop.

The determining, based on maxamp_(y), of whether more than a current bit word length is needed to represent enloop_(y) may comprise determining characteristics of the case when, in the upcoming inner search loop, the unit pulse is added to the position in y being associated with maxamp_(y).

The method may further comprise and the encoder may be configured to, in the inner dimension search loop for unit pulse addition:

-   determining a position, n_(best), in y for addition of a unit pulse by evaluating a cross-multiplication, for each position n in y, of a correlation and energy value for the current n; and a squared correlation, BestCorrSq, and an energy value, bestEn, saved from previous values of n, as:

corr_(xy) ²*bestEn>BestCorrSq*enloop_(y)

where

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\{{bestEn} = {enloop}_{y}} \\{{BestCorrSq} = {corr}_{xy}^{2}}\end{matrix} \right\},{{{when}\mspace{14mu} {corr}_{{xy}\;}^{2}*{bestEn}} > {{BestCorrSq}*{enloop}_{y}}}} & \;\end{matrix}$

The method may also comprise and the encoder may be configured to keep track of maxamp_(y) when a final value of K, associated with the target vector x, exceeds a threshold value. Here the method may comprise and the encoder may be configured to calculate an energy margin, en_margin, only if a current value of K exceeds a threshold value, which may be the threshold value mentioned in the preceding sentence.

According to a third aspect, a communication device is provided, which comprises an encoder according to the second aspect.

According to a fourth aspect, a computer program is provided, which comprises instructions which, when executed on at least one processor, such as a DSP, cause the at least one processor to carry out the method according to the first aspect.

According to a fifth aspect, a carrier is provided, containing the computer program according to the fourth aspect, the carrier being one of an electronic signal, optical signal, radio signal, or computer readable storage medium.

According to a sixth aspect, an encoder is provided, which is configured for PVQ shape search; the PVQ taking a target vector x as input and deriving a vector y by iteratively adding unit pulses in an inner dimension search loop. The provided encoder comprises a first determining unit for, before entering a next inner dimension search loop for unit pulse addition, determining, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), related to an accumulated energy of y, in the upcoming inner dimension loop.

The encoder according to the sixth aspect may comprise a second determining unit for, before entering a next inner dimension loop for unit pulse addition, determining, based on a maximum absolute value, xabs_(max), of the input vector, x, a possible upshift, in a bit word, of the next loop's accumulated in-loop correlation value, corr_(xy), between x and the vector y.

The encoder according to the sixth aspect may comprise a fine search unit for performing the inner loop calculations using a longer bit word length to represent enloop_(y), when more than a current bit word length is needed to represent enloop_(y).

The encoder according to the sixth aspect may comprise a fine search unit for:

-   -   performing the inner loop calculations by employing a first unit pulse addition loop using a first bit word length when more than a current bit word length is not needed to represent enloop_(y), and:
-   -   performing the inner loop calculations by employing a second unit pulse addition loop using a longer bit word length than the first unit pulse addition loop when more than a current bit word length is needed to represent enloop_(y).

The encoder according to the sixth aspect may comprise a fine search unit for:

-   -   performing the inner loop calculations by employing a first unit pulse addition loop, having a certain precision, when more than a current bit word length is not needed to represent enloop_(y); and
-   -   performing the inner loop calculations by employing a second unit pulse addition loop, having a higher precision than the first unit pulse addition loop, when more than a current bit word length is needed to represent enloop_(y).

The encoder according to the sixth aspect may be configured to perform the determining, based on maxamp_(y), of whether more than a current bit word length is needed to represent enloop_(y) by determining characteristics of the case when, in the upcoming inner search loop, the unit pulse is added to the position in y being associated with maxamp_(y).

The encoder according to the sixth aspect may comprise a fine search unit for, in the inner dimension search loop for unit pulse addition:

-   determining a position, n_(best), in y for addition of a unit pulse by evaluating a cross-multiplication, for each position n in y, of a correlation and energy value for the current n; and a squared correlation, BestCorrSq, and energy value, bestEn, saved from previous values of n, as:

corr_(xy) ²*bestEn>BestCorrSq*enloop_(y)

where

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\{{bestEn} = {enloop}_{y}} \\{{BestCorrSq} = {corr}_{xy}^{2}}\end{matrix} \right\},{{{when}\mspace{14mu} {corr}_{{xy}\;}^{2}*{bestEn}} > {{BestCorrSq}*{enloop}_{y}}}} & \;\end{matrix}$

The encoder according to the sixth aspect may comprise a storing unit for keeping track of maxamp_(y) when a number of final unit pulses, K, associated with the target vector x, exceeds a threshold value.

According to a seventh aspect, a communication device is provided, which comprises an encoder according to the sixth aspect.

BRIEF DESCRIPTION OF DRAWINGS

The foregoing and other objects, features, and advantages of the technology disclosed herein will be apparent from the following more particular description of embodiments as illustrated in the accompanying drawings. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the technology disclosed herein.

FIGS. 1-4 illustrate a method for PVQ shape search (fine search), according to different exemplifying embodiments.

FIG. 5 shows steps of an embodiment of a PVQ shape search (fine search),according to an exemplifying embodiment.

FIG. 6 shows steps of the PVQ shape search (fine search) of FIG. 5 in more detail, according to an exemplifying embodiment.

FIG. 7 illustrates embodiments of a PVQ-shape search.

FIG. 8 shows an embodiment of a communication device equipped with an EVS encoder.

FIG. 9 shows an embodiment of a communication device, and

FIG. 10 also shows an embodiment of a communication device.

FIGS. 11a-c show an encoder according to exemplifying embodiments.

FIG. 12 shows an example of a PVQ audio coding system, where at least one part of the system is comprised in an encoder and/or codec which in turn is comprised in a communication device, such as a mobile phone.

DETAILED DESCRIPTION

In floating point arithmetic there is no major issue related to establishing the dynamics of inner loop PVQ shape search iteration parameters. However, in fixed precision DSPs with e.g. 16/32 bit limited accumulators (an accumulator being a register in which intermediate arithmetic and/or logic results are stored) and variables, it is very important to employ efficient search methods where use of the limited dynamic range of the DSP variables is maximized and the precision is maximized, while being able to use as many of the available fast limited-dynamic range DSP operations as possible.

The term “precision” above refers to being able to represent as small numbers as possible, i.e. the number of bits after the decimal point for a specific word length. Another way of saying it is that the precision corresponds to the resolution of the representation, which again is defined by the number of decimal or binary digits. The reason that the precision in embodiments described below may be said to correlate with the number of bits after the decimal point, and not necessarily with the word length itself, is that in fixed point arithmetic there may be different precisions for the same word length. For example, the data formats 1Q15 and 2Q14 both have word length 16, but the first one has 15 bits after the decimal point and the other 14 bits. The smallest number representable would then be 2^−15 and 2^−14 respectively.
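The difference between the two 16 bit formats mentioned above can be made concrete with a short C sketch. The helper names q_resolution and to_q16 are hypothetical and introduced only for illustration; the rounding and saturation behavior is an assumption of this sketch and not something prescribed by the disclosure.

```c
#include <stdint.h>

/* Smallest positive step of a fixed-point format with frac_bits bits
 * after the binary point, e.g. 15 for 1Q15 and 14 for 2Q14. */
double q_resolution(int frac_bits)
{
    return 1.0 / (double)(1L << frac_bits);
}

/* Hypothetical helper: convert a real value to a signed 16 bit word with
 * frac_bits fractional bits, rounding to nearest and saturating to the
 * representable range (both choices are assumptions of this sketch). */
int16_t to_q16(double value, int frac_bits)
{
    double scaled = value * (double)(1L << frac_bits);
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;   /* round half away from zero */
    if (scaled > 32767.0) {
        return INT16_MAX;
    }
    if (scaled < -32768.0) {
        return INT16_MIN;
    }
    return (int16_t)scaled;
}
```

With these assumptions, the value 1.0 saturates in 1Q15 (which cannot represent +1.0 exactly) but is stored exactly as 16384 in 2Q14, at the cost of one bit of resolution.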

A way of performing pyramid vector quantization of the shape is disclosed in section 3.2 of Valin et al., “A full-bandwidth audio codec with low complexity and very low delay”, EUSIPCO, 2009. In this document an MDCT codec is presented where the details, i.e. the shape, in each band are quantized algebraically using a spherical codebook and where the bit allocation is inferred from information shared between the encoder and the decoder. Aspects and embodiments of the disclosure of this application at least loosely relate to how to do a search according to Equations 4-7 in Valin et al., in an efficient way in fixed point limited to e.g. 16/32 bit arithmetic instead of float values as in Valin et al.

In some aspects and embodiments disclosed hereinafter, given a target vector x(n) (t in Equation 0) of certain dimension N, and given a certain number of unit pulses K, the shape is analyzed and a suitable reconstruction vector x_(q)(n)=func(y(n)), which minimizes the shape quantization error, and thus maximizes a perceived quality e.g. in case of audio coding, is determined. At least some of the aspects and embodiments are implemented to aim at finding the optimal constellation of K unit pulses, in a vector y(n) which needs to adhere to the L1 norm, while keeping the complexity under control, i.e. as low as practically possible.

Instead of using prior art open loop methods to determine approximate values for the inner loop dynamic range and accumulator precision, some of the aspects and embodiments are designed to use a low cost, in terms of DSP cycles needed and in terms of additional Program Read-Only Memory (ROM) needed, “near optimal” pre-analysis of the worst case numerator and/or worst case denominator before starting the costly evaluations of the PVQ-shape distortion quotient in the innermost search loop. The “near-optimal” pre-analysis does not aim to scale the values to the exact optimal maximum dynamic range; instead, the pre-analysis determines the near-optimal power of 2 scale factor, as power of 2 scaling may be implemented as shifts of a binary number and such shifts have a low cost in DSP cycles and in DSP ROM.

The denominator precision selection is perceptually motivated as spectrally peaky regions will be allocated more precision than flatter regions.

While some of the main concepts described in this disclosure cover various modifications and alternative constructions, embodiments of the aspects are shown in drawings and exemplary code and will hereinafter be described in detail.

PVQ-Search General Optimization Introduction

An L1-norm structured PVQ-quantizer allows for several search optimizations, where a primary optimization is to move the target to the all positive “quadrant” (could also be denoted orthant or hyper octant) in N-dimensional space and a second optimization is to use an L1-norm projection as a starting approximation for y(n). An L1-norm of K for a PVQ(N,K) means that the absolute sum of all elements in the PVQ-vector y(n) has to be K, just as the absolute sum of all elements in the target shape vector x(n).

A third optimization is to iteratively update Q_(PVQ) quotient terms corr_(xy) ² and energy_(y), instead of re-computing Eq. 4 (below) over the whole vector space N, for every candidate change to the vector y(n) in pursuit of reaching the L1-norm K, which is required for a subsequent indexing step.

The above three major optimization steps are optimizations which generally may exist in past PVQ-implementations such as CELT and IETF-Opus, and partly in G.718. However, for the completeness of the description of aspects and embodiments, these steps are also briefly outlined below.

Efficient PVQ Vector Shape Search

An overview of an audio encoding and decoding system applying an embodiment of the herein proposed PVQ shape search can be seen in FIG. 12. A general shape search using a pyramid projection followed by a fine (shape) search flow can be seen e.g. in FIG. 5. Another embodiment of a fine search part of a shape search is depicted in FIG. 6. A PVQ shape search may comprise a pyramid projection and a fine search. When no pyramid projection is applied, the shape search only comprises the fine search. Therefore, “fine search” and “shape search” may sometimes be used interchangeably herein, since the fine search is a part of the shape search, and when there is no initial coarse search by pyramid projection, performing the shape search is the same thing as performing the fine search. In other words, the fine search may sometimes be or constitute the shape search, and when pyramid projection is applied, the fine search is a part of the shape search.

PVQ-Search Introduction

The goal of the PVQ(N,K) search procedure is to find the best scaled and normalized output vector x_(q)(n). x_(q)(n) is defined as:

$\begin{matrix}{x_{q} = \frac{y}{\sqrt{y^{T}y}}} & \left( {{Eq}.\mspace{11mu} 1} \right)\end{matrix}$

Where y=y_(N,K) is a point on the surface of an N-dimensional hyper-pyramid and the L1 norm of y_(N,K) is K. In other words, y_(N,K) is the selected integer shape code vector of size N, also denoted dimension N, according to:

$\begin{matrix}{y_{N,K} = \left\{ {e\text{:}{\sum\limits_{i = 0}^{N - 1}\left| e_{i} \right|} = K} \right\}} & \left( {{Eq}.\mspace{11mu} 2} \right)\end{matrix}$

That is, the vector x_(q) is the unit energy normalized integer sub vector y_(N,K). The best y vector is the one minimizing the mean squared shape error between the target vector x(n) and the scaled normalized quantized output vector x_(q). This is achieved by minimizing the following search distortion:

$\begin{matrix}{d_{PVQ} = {{{- x^{T}}x_{q}} = {- \frac{\left( {x^{T}y} \right)}{\sqrt{y^{T}y}}}}} & \left( {{Eq}.\mspace{11mu} 3} \right)\end{matrix}$

Or equivalently, by squaring numerator and denominator, maximizing the quotient Q_(PVQ):

$\begin{matrix}{Q_{PVQ} = {\frac{\left( {x^{T}y} \right)^{2}}{y^{T}y} = \frac{\left( {corr}_{xy} \right)^{2}}{{energy}_{y}}}} & \left( {{Eq}.\mspace{11mu} 4} \right)\end{matrix}$

where corr_(xy) is the correlation between x and y. In the search of the optimal PVQ vector shape y(n) with L1-norm K, iterative updates of the Q_(PVQ) variables are made in the all positive “quadrant” in N-dimensional space according to:

corr_(xy)(k, n)=corr_(xy)(k−1)+1·x(n)   (Eq. 5)

energy_(y)(k,n)=energy_(y)(k−1)+2·1² ·y(k−1,n)+1²   (Eq. 6)

where corr_(xy)(k−1) signifies the correlation achieved so far by placing the previous k−1 unit pulses, and energy_(y)(k−1) signifies the accumulated energy achieved so far by placing the previous k−1 unit pulses, and y(k−1, n) signifies the amplitude of y at position n from the previous placement of k−1 unit pulses. To further speed up the in-loop iterative processing, the energy term energy_(y)(k) is scaled down by 2, thus removing one multiplication in the inner-loop.

enloop_(y)(k,n)=energy_(y)(k,n)/2,

enloop_(y)(k,n)=enloop_(y)(k−1)+y(k−1,n)+0.5   (Eq. 7)

where enloop_(y)(k,n) is the preferred energy variable used and accumulated inside the innermost unit pulse search loop, as its iterative update requires one multiplication less than energy_(y)(k,n).

$\begin{matrix}{{Q_{PVQ}\left( {k,n} \right)} = \frac{{{corr}_{xy}\left( {k,n} \right)}^{2}}{{enloop}_{y}\left( {k,n} \right)}} & \left( {{Eq}.\mspace{11mu} 8} \right)\end{matrix}$
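The iterative updates of Eq. 5 and Eq. 7 can be sketched in C as follows. This is a floating point reference sketch for illustration, not a fixed-point implementation; the type pvq_terms and the functions pvq_trial_terms and pvq_full_enloop are hypothetical names introduced here. The second function recomputes half the energy over the whole vector, which is the cost the iterative update avoids.

```c
/* Candidate quotient terms when a unit pulse is tentatively added at
 * position n (hypothetical type, for illustration only). */
typedef struct {
    double corr_xy;   /* updated correlation, Eq. 5 */
    double enloop_y;  /* updated half energy, Eq. 7 */
} pvq_terms;

/* corr_prev and enloop_prev are the terms after k-1 placed pulses,
 * x_n is the (non-negative) target value at n, y_n the amplitude of y at n. */
pvq_terms pvq_trial_terms(double corr_prev, double enloop_prev,
                          double x_n, int y_n)
{
    pvq_terms t;
    t.corr_xy  = corr_prev + x_n;                   /* Eq. 5 */
    t.enloop_y = enloop_prev + (double)y_n + 0.5;   /* Eq. 7 */
    return t;
}

/* Reference: half the energy of y recomputed over all n_dim positions. */
double pvq_full_enloop(const int *y, int n_dim)
{
    double energy = 0.0;
    for (int n = 0; n < n_dim; n++) {
        energy += (double)y[n] * (double)y[n];
    }
    return energy / 2.0;
}
```

For y = {1, 0, 2}, half the energy is 2.5; tentatively adding a pulse at the last position gives 2.5 + 2 + 0.5 = 5.0, which equals half the energy of {1, 0, 3} without revisiting the other positions.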

The best position n_(best) for the k'th unit pulse is iteratively updated by increasing n from 0 to N−1:

n _(best) =n, if Q _(PVQ)(k,n)>Q _(PVQ)(k,n _(best))   (Eq. 9)

To avoid costly divisions, which is especially important in fixed point arithmetic, the Q_(PVQ) maximization update decision is performed using a cross-multiplication of the saved best squared correlation numerator bestCorrSq and the saved best energy denominator bestEn so far, which could be expressed as:

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\{{bestCorrSq} = {{corr}_{xy}\left( {k,n} \right)}^{2}} \\{{bestEn} = {{enloop}_{y}\left( {k,n} \right)}}\end{matrix} \right\},{{{if}\mspace{14mu} {{{corr}_{xy}\left( {k,n} \right)}^{2} \cdot {bestEn}}} > {{bestCorrSq} \cdot {{enloop}_{y}\left( {k,n} \right)}}}} & \left( {{Eq}.\mspace{11mu} 10} \right)\end{matrix}$
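A minimal floating point sketch of the division-free decision in Eq. 10 is given below. The function name pvq_best_position is hypothetical, and seeding the saved best values from position 0 is a choice made for this sketch rather than something stated in the text; a fixed-point version would additionally need the word length handling discussed in the following sections.

```c
/* Returns the position n_best at which one more unit pulse maximizes
 * corr_xy^2 / enloop_y, using cross-multiplication instead of division.
 * xabs: non-negative target, y: current non-negative pulse vector,
 * corr_prev / enloop_prev: accumulated terms before this pulse. */
int pvq_best_position(const double *xabs, const int *y, int n_dim,
                      double corr_prev, double enloop_prev)
{
    /* Seed the saved best values from position 0 (sketch assumption). */
    int    n_best       = 0;
    double corr0        = corr_prev + xabs[0];
    double best_corr_sq = corr0 * corr0;
    double best_en      = enloop_prev + (double)y[0] + 0.5;

    for (int n = 1; n < n_dim; n++) {
        double corr    = corr_prev + xabs[n];               /* Eq. 5 */
        double corr_sq = corr * corr;
        double en      = enloop_prev + (double)y[n] + 0.5;  /* Eq. 7 */

        /* Eq. 10: corr_sq / en > best_corr_sq / best_en, without dividing. */
        if (corr_sq * best_en > best_corr_sq * en) {
            n_best       = n;
            best_corr_sq = corr_sq;
            best_en      = en;
        }
    }
    return n_best;
}
```

The cross-multiplication is valid because both energy terms are strictly positive, so multiplying both sides of the quotient comparison by them does not change the direction of the inequality.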

The iterative maximization of Q_(PVQ)(k, n) may start from a zero number of placed unit pulses or from an adaptive lower cost pre-placement number of unit pulses, based on an integer projection to a point below the K'th-pyramid's surface, with a guaranteed undershoot of unit pulses in the target L1 norm K.

PVQ Search Preparation Analysis

Due to the structured nature of the y_(N,K) PVQ integer vector, where all possible sign combinations are allowed and it is possible to encode all sign combinations, as long as the resulting vector adheres to the L1 norm of K unit pulses, the search is performed in the all positive first “quadrant” (the reason for the quotation marks on “quadrant” is that a true quadrant only exists when N=2, and N may here be more than 2). Further, as realized by the inventor, to achieve as high an accuracy as possible for a limited precision implementation, the maximum absolute value xabs_(max) of the input signal x(n) may be pre-analyzed for future use in the setup of the inner loop correlation accumulation procedure.

xabs(n)=|x(n)|, for n=0, . . . , N−1   (Eq. 11)

xabs_(max)=max(xabs₀ , . . . , xabs_(N−1))   (Eq. 12)
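Eq. 11 and Eq. 12 amount to a single pass over the input, which may be sketched in C for 16 bit samples as follows. The function name pvq_xabs_max is hypothetical, and saturating the magnitude of −32768 to +32767 is an assumption of this sketch, chosen so that the result always fits a signed 16 bit word.

```c
#include <stdint.h>

/* Writes |x(n)| to xabs (Eq. 11) and returns the maximum (Eq. 12). */
int16_t pvq_xabs_max(const int16_t *x, int16_t *xabs, int n_dim)
{
    int16_t xabs_max = 0;
    for (int n = 0; n < n_dim; n++) {
        int32_t a = x[n];
        if (a < 0) {
            a = -a;
        }
        if (a > INT16_MAX) {
            a = INT16_MAX;   /* |-32768| does not fit, saturate */
        }
        xabs[n] = (int16_t)a;
        if (xabs[n] > xabs_max) {
            xabs_max = xabs[n];
        }
    }
    return xabs_max;
}
```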

Handling of Very Low Energy Targets and Very Low Energy Sub-Vectors

In case the input target vector (x in Eq. 3 or t in Eq. 0) is an all zero vector and/or the vector gain (e.g. G in Eq. 0) is very low, the PVQ-search may be bypassed, and a valid PVQ-vector y may be deterministically created by assigning half of the K unit pulses to the first position

$\left( {{y\lbrack 0\rbrack} = \left\lfloor \frac{K}{2} \right\rfloor} \right)$

and the remaining unit pulses to the last position (y[N−1]=y[N−1]+(K−y[0])).

The terms “very low energy targets” and “very low vector gain” refer, in one embodiment, to values as low as zero, as illustrated in the exemplary ANSI C-code disclosed below, where the corresponding code is:

IF( L_xsum == 0 || neg_gain == 0 ) { /* zero input or zero gain case */

However, it may also be less than or equal to epsilon, or EPS, where EPS is the lowest value which is higher than zero and which is regarded as being worth representing in a selected precision. For example, in a precision Q15 in a signed 16 bit word, the “very low” value for the sub-vector gain becomes EPS=1/2^15=1/32768 (e.g. vector gain less than or equal to 0.000030517578125), and in case of precision Q12 in a signed 16 bit word for target vector x(n), the “very low” value becomes EPS=1/2^12, e.g. sum(abs(x(n))) less than or equal to 0.000244140625. In one embodiment of fixed-point DSP arithmetic with a 16 bit word, an unsigned integer format may take any integer value from 0 to 65535, whereas a signed integer may take any value from −32768 to +32767. Using unsigned fractional format, the 65536 levels are spread uniformly between 0 and +1, whereas in a signed fractional format embodiment the levels would be equally spaced between −1 and +1.

By applying this optional step related to zero-vectors and low gain values, the PVQ-search complexity is reduced and the indexing complexity is spread/shared between encoder indexing and decoder de-indexing, i.e. no processing is “wasted” for searching a zero target vector or a very low target vector which would in any way be scaled down to zero.
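The deterministic fallback vector described in this section can be sketched in a few lines of C. The function name pvq_zero_target_vector is hypothetical; the sketch only covers the creation of y and leaves out the detection of the zero-input or zero-gain condition.

```c
/* Creates a valid PVQ vector with exactly k unit pulses without searching:
 * floor(k/2) pulses at the first position, the remainder at the last. */
void pvq_zero_target_vector(int *y, int n_dim, int k)
{
    for (int n = 0; n < n_dim; n++) {
        y[n] = 0;
    }
    y[0] = k / 2;                 /* integer division gives floor(K/2) */
    y[n_dim - 1] += k - y[0];     /* accumulate, so N = 1 also sums to K */
}
```

Writing the last assignment as an accumulation, as in the text, keeps the L1-norm equal to K even in the degenerate case N=1, where the first and last positions coincide.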

Optional PVQ Pre-Search Projection

If the pulse density ratio K/N is larger than 0.5 unit pulses per coefficient, e.g. modified discrete cosine transform coefficient, a low cost projection to the K−1 sub pyramid is made and used as a starting point for y. On the other hand, if the pulse density ratio is less than 0.5 unit pulses per coefficient, the iterative PVQ-search will start off from 0 pre-placed unit pulses. The low cost projection to “K−1” is typically less computationally expensive in DSP cycles than repeating the unit pulse inner loop search K−1 times. However, a drawback of the low cost projection is that it will produce an inexact result due to the N-dimensional floor function application. The resulting L1-norm of the low cost projection using the floor function may typically be anything between “K−1” and roughly “K−5”, i.e. the result after the projection needs to be fine searched to reach the target norm of K.

The low cost projection is performed as:

$\begin{matrix}{{proj}_{fac} = \frac{K - 1}{\sum\limits_{n = 0}^{N - 1}{{xabs}(n)}}} & \left( {{Eq}.\mspace{11mu} 13} \right) \\{{y(n)} = {y_{start}(n)} = \left\lfloor {{{xabs}(n)} \cdot {proj}_{fac}} \right\rfloor,\quad \text{for } n = 0,\ldots,N - 1} & \left( {{Eq}.\mspace{11mu} 14} \right)\end{matrix}$

If no projection is made, the starting point is an all zeroed y(n) vector. The DSP cost of the projection in DSP cycles is in the neighborhood of N (absolute sum) + 25 (the division) + 2N (multiplication and floor) cycles.

In preparation for the fine search to reach the K'th-pyramid's surface, the accumulated number of unit pulses pulse_(tot), the accumulated correlation corr_(xy)(pulse_(tot)) and the accumulated energy energy_(y)(pulse_(tot)) for the starting point are computed as:

$\begin{matrix}{pulse}_{tot} = \sum\limits_{n = 0}^{N - 1} y(n) & \left( {Eq}.\mspace{11mu} 15 \right) \\ {corr}_{xy}\left( {pulse}_{tot} \right) = \sum\limits_{n = 0}^{N - 1} y(n) \cdot {xabs}(n) & \left( {Eq}.\mspace{11mu} 16 \right) \\ {energy}_{y}\left( {pulse}_{tot} \right) = \sum\limits_{n = 0}^{N - 1} y(n) \cdot y(n) = \left\| y \right\|_{L2}^{2} & \left( {Eq}.\mspace{11mu} 17 \right) \\ {enloop}_{y}\left( {pulse}_{tot} \right) = {energy}_{y}\left( {pulse}_{tot} \right) / 2 & \left( {Eq}.\mspace{11mu} 18 \right)\end{matrix}$
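The projection of Eq. 13-14 and the start values of Eq. 15-18 can be sketched together in floating point C as below. The type pvq_start and the function pvq_project are hypothetical names. Two choices are assumptions of this sketch: a pulse density of exactly 0.5 is treated as the no-projection case (the text only specifies the behavior above and below 0.5), and an all zero target also starts from zero pulses.

```c
/* Start values for the fine search (hypothetical type). */
typedef struct {
    int    pulse_tot;  /* Eq. 15 */
    double corr_xy;    /* Eq. 16 */
    int    energy_y;   /* Eq. 17 */
    double enloop_y;   /* Eq. 18 */
} pvq_start;

/* xabs: non-negative target of dimension n_dim, k: target number of
 * unit pulses. Fills y with the starting point and returns its terms. */
pvq_start pvq_project(const double *xabs, int *y, int n_dim, int k)
{
    pvq_start s = {0, 0.0, 0, 0.0};
    double xsum = 0.0;
    double proj_fac = 0.0;   /* 0 means: start from an all zero y */

    for (int n = 0; n < n_dim; n++) {
        xsum += xabs[n];
    }
    /* Project to the K-1 sub pyramid only when K/N > 0.5 (assumption:
     * equality and an all zero target fall back to no projection). */
    if (2 * k > n_dim && xsum > 0.0) {
        proj_fac = (double)(k - 1) / xsum;          /* Eq. 13 */
    }
    for (int n = 0; n < n_dim; n++) {
        /* Truncation equals floor here because the product is >= 0. */
        y[n] = (int)(xabs[n] * proj_fac);           /* Eq. 14 */
        s.pulse_tot += y[n];
        s.corr_xy   += (double)y[n] * xabs[n];
        s.energy_y  += y[n] * y[n];
    }
    s.enloop_y = 0.5 * (double)s.energy_y;
    return s;
}
```

Because of the floor operation, pulse_tot generally ends up at or below K−1, and the remaining K − pulse_tot unit pulses are left for the fine search.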

PVQ Fine Search

The solution disclosed herein is related to the PVQ fine search (which constitutes or is part of the PVQ-shape search, as previously described). What has been described in the preceding sections is mainly prior art PVQ, except for the upfront determining of xabs_(max), which will be further described below. The final integer shape vector y(n) of dimension N must adhere to the L1 norm of K pulses. The fine search is assumed to be configured to start from a lower point in the pyramid, i.e. below the K'th pyramid, and iteratively find its way to the surface of the N-dimensional K'th-hyperpyramid. The K-value in the fine search can typically range from 1 to 512 unit pulses.

The inventor has realized that, in order to keep the complexity of the search and PVQ indexing at a reasonable level, the search may be split into two main branches, where one branch is used when it is known that the in-loop energy representation of y(n) will stay within a signed, or unsigned, 16 bit word during a next inner search loop iteration, and another branch is used when the in-loop energy may exceed the dynamic range of a 16 bit word during a next inner search loop iteration.

Fixed Precision Fine Search for a Low Number of Unit Pulses

When the final K is lower than or equal to a threshold of t_(p)=127 unit pulses, the dynamics of the energy_(y)(K) will always stay within 14 bits, and the dynamics of the 1 bit upshifted enloop_(y)(K) will always stay within 15 bits. This allows use of a signed 16 bit word for representing every enloop_(y)(k) within all the fine pulse search inner loop iterations up to k=K. In other words, there will be no need for a word bit length exceeding 16 bits for representing energy_(y)(K) or enloop_(y)(K) in any fine pulse search inner loop iteration when K≤127.

In the case of the availability of efficient DSP Multiply, MultiplyAdd (multiply-add) and MultiplySubtract (multiply-subtract) operators for unsigned 16 bit variables, the threshold can be increased to t_(p)=255, as then enloop_(y)(K) will always stay within an unsigned 16 bit word. MultiplyAdd is here in one embodiment multiply-add instructions or equivalent operations to multiply data values representing audio and video signals by filter or transform values and accumulate the products to produce a result. MultiplySubtract operations are the same as the MultiplyAdd operations, except the adds are replaced by subtracts.
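The two thresholds can be checked with a short worst-case bound. For non-negative integer pulses summing to K, the energy of Eq. 17 is largest when all K unit pulses are placed at a single position, which gives:

$\begin{matrix}{\max\ {energy}_{y}(K) = K^{2}} \\ {127^{2} = 16129 < 2^{14} = 16384} \\ {255^{2} = 65025 < 2^{16} = 65536}\end{matrix}$

The first inequality matches the 14 bit statement for t_(p)=127, and the second shows that the accumulated energy for t_(p)=255 still fits in an unsigned 16 bit word.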

In preparation for the next unit pulse addition, the near optimal maximum possible upshift of the next loop's accumulated in-loop correlation value, corr_(xy), in a signed 32 bit word is pre-analyzed using the previously calculated maximum absolute input value xabs_(max) as:

corr_(upshift)=31−⌈log2(corr_(xy)(pulse_(tot))+2·(1·xabs_(max)))⌉  (Eq. 19)

This upshift calculated in Eq. 19 represents the “worst case”, and covers the maximum possible upshift that can be done in the next inner loop, and thus ensures that the most significant information related to correlation will not be lost, or outshifted, during the inner loop iteration, even for the worst case scenario.

This worst case pre-inner loop dynamic analysis can be performed in 2-3 cycles in most DSP architectures using MultiplyAdd and Norm instructions (normalization), and the analysis is always the same independent of the dimension N. In an ITU-T G.191 virtual 16/32-bit DSP the operations in Eq. 19 become: “corr_upshift=norm_l(L_mac(*L_corrxy,1, xabs_max));” with a cost of 2 cycles. It should be noted that norm_l(x) here corresponds to “31−⌈log2(x)⌉”, and could alternatively be denoted 31−ceil(log2(x)), where ceil(x) is the so-called ceiling function, mapping a real number to the smallest following integer. More precisely, ceiling(x)=⌈x⌉ is the smallest integer not less than x. For corr_(upshift), the term within the brackets with upper horizontal bar is always a positive number. The corr_(upshift) could alternatively be calculated using a floor function as:

corr_(upshift)=30−⌊log2(corr_(xy)(pulse_(tot))+2·(1·xabs_(max)))⌋

where floor(x)=⌊x⌋ is the largest integer not greater than x.
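For readers without access to the STL basic operators, the floor form of the upshift analysis can be sketched in portable C as follows. The helper norm_l_pos() is a hypothetical stand-in that mimics the normalization count for positive inputs only; saturation of the multiply-add is not modelled, so the sum is assumed not to overflow a signed 32 bit word.

```c
#include <stdint.h>

/* Hypothetical stand-in for a Norm instruction, positive inputs only:
   number of left shifts needed to move the top set bit to bit position 30. */
static int norm_l_pos(int32_t v)
{
    int shifts = 0;
    if (v <= 0)
        return 0;                 /* sketch: non-positive inputs are not handled */
    while (v < 0x40000000L) {
        v <<= 1;
        shifts++;
    }
    return shifts;
}

/* Worst-case upshift for the next inner loop: the largest correlation
   the loop can produce is corr_xy + 2 * xabs_max. */
static int corr_upshift(int32_t corr_xy, int16_t xabs_max)
{
    return norm_l_pos(corr_xy + 2 * (int32_t)xabs_max);
}
```

With corr_xy=1000 and xabs_max=12 the worst case sum is 1024=2^10, so the sketch returns 30−10=20 possible upshifts.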

Another benefit of the herein suggested approach to near optimal shape search correlation scaling is that the proposed method does not require a pre-normalized target vector x, which will save some additional complexity before starting the shape search.

To make the iterative Eq. 10 update as efficient as possible, the corr_(xy)(k,n)² numerator may be represented by a 16 bit signed word, even when comprising more information than fits in a 16 bit word, by the following approach.

$\begin{matrix}{{{corr}_{{xy}\; 16}\left( {k,n} \right)}^{2} = \left( {{Round}_{16}\left( {\left( {{{corr}_{xy}\left( {pulse}_{tot} \right)} + {2 \cdot \left( {{1 \cdot x}\; {{abs}(n)}} \right)}} \right) \cdot 2^{{corr}_{upshift}}} \right)} \right)^{2}} & \left( {{Eq}.\mspace{11mu} 20} \right)\end{matrix}$

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\{{bestCorrSq}_{16} = {{corr}_{{xy}\; 16}\left( {k,n} \right)}^{2}} \\{{bestEn}_{16} = {{enloop}_{y}\left( {k,n} \right)}}\end{matrix} \right\},{{{if}\mspace{14mu} {{{corr}_{{xy}\; 16}\left( {k,n} \right)}^{2} \cdot {bestEn}_{16}}} > {{bestCorrSq}_{16} \cdot {{{enloop}_{y}\left( {k,n} \right)}.}}}} & \left( {{Eq}.\mspace{11mu} 21} \right)\end{matrix}$

where the function “Round₁₆” extracts the top 16 bits of a signed 32 bit variable with rounding. This near optimal upshift (Eq. 19) and the use of a 16 bit representation of the squared correlation bestCorrSq₁₆ enable a very fast inner-loop search using only ~9 cycles for performing the Eq. 21 test and the three variable updates, when using a DSP's optimized Multiply, MultiplyAdd and MultiplySubtract functions.

The location of the next unit pulse in the vector y is now determined by iterating over the n=0, . . . , N−1 possible positions in vector y, while employing equations Eq. 20, Eq. 6 and Eq. 21.

When the best position n_(best) for the unit pulse (in the vector y achieved so far) has been determined, the accumulated correlation corr_(xy)(k), the accumulated in-loop energy enloop_(y)(k) and the number of accumulated unit pulses pulse_(tot) are updated. If there are further unit pulses to add, i.e. when pulse_(tot)<K, a new inner-loop is started with a new near optimal corr_(upshift) analysis (Eq. 19) for the addition of a next unit pulse.
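One iteration over all positions in the low precision loop might look like the following portable C sketch. It is not the reference fixed-point code. It assumes that adding a unit pulse at position n raises the energy by 2·y(n)+1, that the squared 16 bit correlation is scaled back into 16 bits by a right shift of 15, and that the caller has already verified that the energy fits in a signed 16 bit word. All function names are hypothetical.

```c
#include <stdint.h>

/* Round16 of Eq. 20: top 16 bits of a 32 bit value with rounding, saturated. */
static int16_t round16(int32_t v)
{
    int64_t r = ((int64_t)v + 0x8000) >> 16;
    return (int16_t)(r > 32767 ? 32767 : r);
}

/* Returns the position n_best for the next unit pulse (low precision loop). */
static int best_pulse_pos_16(const int16_t *xabs, const int16_t *y, int n_dim,
                             int32_t corr_xy, int32_t energy_y, int corr_upshift)
{
    int     n_best = 0;
    int16_t best_corr_sq = 0;
    int16_t best_en = 0;

    for (int n = 0; n < n_dim; n++) {
        /* Eq. 20: candidate correlation, upshifted and rounded to 16 bits */
        int32_t corr32  = (corr_xy + 2 * (int32_t)xabs[n]) << corr_upshift;
        int16_t corr16  = round16(corr32);
        int16_t corr_sq = (int16_t)(((int32_t)corr16 * corr16) >> 15);
        /* energy after adding one pulse at n (assumed update: +2*y(n)+1) */
        int16_t en = (int16_t)(energy_y + 2 * (int32_t)y[n] + 1);

        /* Eq. 21: cross-multiplied comparison, no division in the loop */
        if (n == 0 || (int32_t)corr_sq * best_en > (int32_t)best_corr_sq * en) {
            n_best = n;
            best_corr_sq = corr_sq;
            best_en = en;
        }
    }
    return n_best;
}
```

Because corr_upshift is derived from xabs_max, the shifted value cannot overflow the 32 bit word for any position n. Seeding the "best so far" values from position 0 is a choice of this sketch.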

In total, this suggested approach has a worst case complexity for each unit pulse added to y(n) of roughly 5/N+15 cycles per quantized coefficient. In other words, a loop over a vector of size N for adding a unit pulse has a worst case complexity of about N*(5/N+15) cycles, i.e. 5+15*N cycles.

Fixed Precision Fine Search for a High Number of Unit Pulses

When K is higher than a threshold t_(p), which in this exemplifying embodiment, assuming a 16/32 bit restricted DSP, is t_(p)=127 unit pulses, the dynamics of the parameter energy_(y)(K) may exceed 14 bits, and the dynamics of the 1 bit upshifted enloop_(y)(K) may exceed 15 bits. Thus, in order not to use unnecessarily high precision, the fine search is configured to adaptively choose between 16 bit representation and 32 bit representation of the pair {corr_(xy)(k,n)², enloop_(y)(k,n)} when K is higher than t_(p). When it is known in advance that K for the vector y(n) will end up at a final value higher than 127, the fine search will keep track of the maximum pulse amplitude maxamp_(y) in y achieved so far. This may also be expressed as maxamp_(y) being determined. This maximum pulse amplitude information is used in a pre-analysis step before entering the optimized inner dimension loop. The pre-analysis comprises determining what precision should be used for the upcoming unit pulse addition inner-loop. As shown in FIG. 12 by the input of N, K to the PVQ-shape search, the bit allocation is known/determined before the PVQ search is initiated. The bit allocation may use formulas or stored tables for obtaining, determining and/or calculating the K to be input to the PVQ-shape search, e.g. K=function(bits(band), N) for a certain band with the dimension N and a certain number of bits(band).

Bits required for PVQ(N, K):

| K | N = 8 | N = 16 | N = 32 |
|---|---|---|---|
| K = 4 | 11.4594 | 15.4263 | 19.4179 |
| K = 5 | 13.2021 | 18.1210 | 23.1001 |
| K = 6 | 14.7211 | 20.5637 | 26.5222 |
| K = 7 | 16.0631 | 22.7972 | 29.7253 |

For example, a stored table as the one shown above may be used to determine or select a value of K. If the dimension N is 8 and the available bits for the band bits(band) is 14.0, then K will be selected to be 5, as PVQ(N=8,K=6) requires 14.7211 bits, which is higher than the number of available bits 14.0.
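The table-based selection of K in this example can be sketched in C as below. The array holds only the N=8 column of the table above, the function name is hypothetical, and returning 0 when no tabulated K fits is an assumption of this sketch rather than behaviour described in the text.

```c
#include <stddef.h>

/* Bits required for PVQ(N = 8, K) for K = 4..7, copied from the table above. */
static const double pvq_bits_n8[] = {11.4594, 13.2021, 14.7211, 16.0631};
static const int    pvq_k_first   = 4;

/* Largest tabulated K whose bit cost fits in bits_available, or 0 if none fits. */
static int select_k_n8(double bits_available)
{
    int k_sel = 0;
    for (size_t i = 0; i < sizeof pvq_bits_n8 / sizeof pvq_bits_n8[0]; i++) {
        if (pvq_bits_n8[i] <= bits_available)
            k_sel = pvq_k_first + (int)i;
    }
    return k_sel;
}
```

Calling select_k_n8(14.0) selects K=5, in line with the example in the text.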

If the pre-analysis indicates that more than a signed 16 bit word is needed to represent the in-loop energy without losing any energy information, a higher precision and computationally more intensive unit pulse addition loop is employed, where both the saved best squared correlation term and the saved best accumulated energy term are represented by 32 bit words.

en_(margin)=31−⌈log2((1+energy_(y)(pulse_(tot))+2·(1·maxamp_(y))))⌉  (Eq. 22)

highprecision_(active)=FALSE, if (en_(margin)≥16)

highprecision_(active)=TRUE, if (en_(margin)<16)   (Eq. 23)

The worst case pre-inner loop dynamic analysis can be performed in 5-6 additional cycles in most DSPs, and the analysis cost is the same for all dimensions. In an ITU-T G.191 STL 2009 virtual 16/32 bit DSP the operations in Eq. 22 and Eq. 23 become:

    L_energy_y = L_add(L_energy_y, 1);                   /* 0.5 added */
    en_margin = norm_l(L_mac(L_energy_y, 1, maxamp_y));
    highprecision_active = 1;  move16();
    if (sub(16, en_margin) <= 0) {
        highprecision_active = 0;  move16();
    }

with a cost of maximum 6 cycles.

The corresponding code in the ANSI-C code example below is:

    L_yy = L_add(L_yy, 1);                           /* .5 added */
    en_margin = norm_l(L_mac(L_yy, 1, max_amp_y));   /* find max "addition", margin, ~2 ops */
    en_dn_shift = sub(16, en_margin);                /* calc. shift to lower word */
    high_prec_active = 1;  move16();
    if (en_dn_shift <= 0) {                          /* only use 32 bit energy if actually needed */
        high_prec_active = 0;  move16();
    }

Alternatively, the energy margin en_margin in Eq. 22 could, in line with the operation of the G.191 STL function norm_l( ), be calculated using the floor function as:

en_(margin)=30−⌊log2((1+energy_(y)(pulse_(tot))+2·(1·maxamp_(y))))⌋

If highprecision_(active) is FALSE, i.e. =0, the lower precision inner search loop in Eq. 20, Eq. 6 and Eq. 21 is employed. On the other hand, when highprecision_(active) is TRUE, i.e. =1, the location of the next unit pulse is determined employing a higher precision inner loop, representing enloop_(y) and corr_(xy)² with 32 bit words in this example. That is, when highprecision_(active) is TRUE, the location of the next unit pulse in y(n) is determined by iterating over the n=0, . . . , N−1 possible positions, using equations Eq. 24, Eq. 6 and Eq. 25.

$\begin{matrix}{{{corr}_{{xy}\; 32}\left( {k,n} \right)}^{2} = \left( {\left( {{{corr}_{xy}\left( {pulse}_{tot} \right)} + {2 \cdot \left( {{1 \cdot x}\; {{abs}(n)}} \right)}} \right) \cdot 2^{{corr}_{upshift}}} \right)^{2}} & \left( {{Eq}.\mspace{11mu} 24} \right) \\{\left. \begin{matrix}{n_{best} = n} \\{{bestCorrSq}_{32} = {{corr}_{{xy}\; 32}\left( {k,n} \right)}^{2}} \\{{bestEn}_{32} = {{enloop}_{y}\left( {k,n} \right)}}\end{matrix} \right\},{{{if}\mspace{14mu} {{{corr}_{{xy}\; 32}\left( {k,n} \right)}^{2} \cdot {bestEn}_{32}}} > {{bestCorrSq}_{32} \cdot {{enloop}_{y}\left( {k,n} \right)}}}} & \left( {{Eq}.\mspace{11mu} 25} \right)\end{matrix}$

In other words, en_margin is indicative of how many upshifts can be used to normalize the energy in the next loop. If 16 or more upshifts can be used, then the energy stays in the lower word length, assuming 16/32 bit word lengths, and there is no need for the high precision (32 bit representation) loop, so highprecision_(active) is set to FALSE. One implementation reason for doing it in this way (allowing the energy information to stay in the low part of the L_energy 32 bit word) is that it is computationally cheaper: it costs only 1 cycle to compute extract_l(L_energy), whereas an alternative round_fx(L_shl(L_energy,en_margin)) takes two cycles.
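The same decision can be written without a normalization instruction. In the norm-based code a margin below 16 upshifts corresponds to a worst case value of 2^15 or more, so the test reduces to a comparison against the largest signed 16 bit value. The following C sketch assumes plain 32 bit arithmetic and uses a hypothetical function name.

```c
#include <stdint.h>

/* Returns 1 when the worst case in-loop energy of the next inner loop
   (1 + energy_y + 2 * maxamp_y) no longer fits in a signed 16 bit word,
   i.e. when the norm-based margin would be below 16 upshifts. */
static int high_precision_active(int32_t energy_y, int16_t maxamp_y)
{
    int32_t worst = energy_y + 1 + 2 * (int32_t)maxamp_y;
    return worst > INT16_MAX;
}
```

For example, energy_y=32000 with maxamp_y=100 gives a worst case of 32201, so the low precision loop can still be used, while energy_y=32700 with the same maxamp_y gives 32901 and selects the high precision loop.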

When the best position n_(best) of the unit pulse has been determined, the accumulated correlation corr_(xy)(k), the accumulated in-loop energy enloop_(y)(k) and the number of accumulated unit pulses pulse_(tot) are updated. Further, the maximum amplitude maxamp_(y) in the best integer vector y so far is kept up to date, i.e. determined, for the next unit pulse addition loop.

maxamp_(y)=max(maxamp_(y) , y[n _(best)])   (Eq. 26)

If there are further unit pulses to add, i.e. when pulse_(tot)<K, a new inner-loop is started with a new near optimal corr_(upshift) analysis (Eq. 19) and a new energy precision analysis (Eq. 22 and Eq. 23), after which the next unit pulse loop is commenced with equations Eq. 24, Eq. 6 and Eq. 25.

The high precision approach (in this example 32 bit words) has a worst case complexity for each unit pulse added to y(n) of roughly 7/N+31 cycles per quantized coefficient.

The effect of the in-loop accumulated energy based inner loop precision selection is that target sub vectors that have a high peakiness, or have very fine granularity, i.e. the final K is high, will be using the higher precision loop and more cycles more often, while non-peaky or low pulse granularity sub vectors will more often use the lower precision loop and fewer cycles.

It should be noted that the analysis described in the above section could be performed also when K<t_(p). However, an embodiment may be made more efficient by the introduction of a threshold t_(p) for applying the above analysis.

PVQ Vector Finalization and Normalization

After shape search, each non-zero PVQ-vector element is assigned its proper sign and the vector is L2-normalized (a.k.a. Euclidean normalization) to unit energy. Additionally, if the band was split, it is further scaled with a sub vector gain.

if (y(n)>0) ∧ (x(n)<0)

y(n)=−y(n), for n=0, . . . , N−1   (Eq. 27)

$\begin{matrix}{{norm}_{gain} = \frac{1}{\sqrt{y^{T}y}}} & \left( {{Eq}.\mspace{11mu} 28} \right) \\{{{x_{q}(n)} = {{norm}_{gain} \cdot {y(n)}}},{{{for}\mspace{14mu} n} = 0},\ldots \;,{N - 1}} & \left( {{Eq}.\mspace{11mu} 29} \right)\end{matrix}$
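As a small worked example of Eq. 27 to Eq. 29, with values chosen purely for illustration, take N=4, a target x=(−0.9, 0.1, 0.4, 0.0) and a search result y=(2, 0, 1, 0), i.e. K=3:

$\begin{matrix}{y = (-2,\ 0,\ 1,\ 0)} & \text{(sign of } x(0) \text{ applied, Eq. 27)} \\ {y^{T}y = 4 + 0 + 1 + 0 = 5,\quad {norm}_{gain} = 1/\sqrt{5} \approx 0.4472} & \text{(Eq. 28)} \\ {x_{q} \approx (-0.8944,\ 0,\ 0.4472,\ 0)} & \text{(Eq. 29)}\end{matrix}$

The resulting x_(q) has unit energy, since 0.8944² + 0.4472² ≈ 1.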

Above, two precision methodologies were presented and specified:

-   “En16×CorrSq16”, as defined in the section above (equations 19 through 21), and “En32×CorrSq32” (equations 22 through 26). Two further medium complexity methods, where the precision of the numerator Correlation Squared term and the Energy term are varied, are described below.

“En16×CorrSq32” and “En32×CorrSq16” Methods

The “En16×CorrSq32” method is similar to the “En32×CorrSq32” method, but with the difference that the inner loop best found unit pulse update and comparison uses a 16 bit representation of the best Energy bestEn₁₆ so far, according to:

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\{{bestCorrSq}_{32} = {{corr}_{{xy}\; 32}\left( {k,n} \right)}^{2}} \\{{bestEn}_{16} = {{enloop}_{y}\left( {k,n} \right)}}\end{matrix} \right\},{{{if}\mspace{14mu} {{{corr}_{{xy}\; 32}\left( {k,n} \right)}^{2} \cdot {bestEn}_{16}}} > {{bestCorrSq}_{32} \cdot {{enloop}_{y}\left( {k,n} \right)}}}} & \left( {{Eq}.\mspace{11mu} 30} \right)\end{matrix}$

The approximate cost of the “En16×CorrSq32” method per unit pulse is 5/N+21 cycles.

The “En32×CorrSq16” method is similar to the “En32×CorrSq32” method, but with the difference that the inner loop best found unit pulse update and comparison uses a 16 bit representation of the best squared correlation bestCorrSq₁₆ so far, according to:

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\ {{bestCorrSq}_{16} = {corr}_{xy16}(k,n)^{2}} \\ {{bestEn}_{32} = {enloop}_{y}(k,n)}\end{matrix} \right\},\ {if}\mspace{14mu} {corr}_{xy16}(k,n)^{2} \cdot {bestEn}_{32} > {bestCorrSq}_{16} \cdot {enloop}_{y}(k,n)} & \left( {{Eq}.\mspace{11mu} 31} \right)\end{matrix}$

The approximate cost of the “En32×CorrSq16” method per unit pulse addition is 6/N+20 cycles per coefficient.

Aspects and Exemplifying Embodiments

Below, some exemplifying embodiments of the solution disclosed herein will be described with reference to FIGS. 1-4.

FIG. 1 is a flow chart illustrating a method concerning the fine search of PVQ shape search. The method is intended to be performed by an encoder, such as a media encoder for audio and/or video signals. The PVQ takes a target shape vector x as input, and derives a vector y by iteratively adding unit pulses in an inner dimension search loop. The method relates to a pre-analysis, which is done before entering the inner loop. An output vector x_(q) will then be derived based on the vector y, as previously described. However, the forming of x_(q) is not central to the solution described herein, and will therefore not be further described here.

In the method illustrated in FIG. 1, it may be assumed that the encoder keeps track of a value maxamp_(y) of a current vector y. By “current vector y” is here meant the vector y composed, found or constructed so far, i.e. for a k<K. As previously described, a starting point for the vector y may be a projection to a surface below the K^(th) pyramid, or an empty all-zero vector. The method illustrated in FIG. 1 comprises, before entering a next inner dimension search loop for unit pulse addition, determining 101, based on the maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent enloop_(y) in a lossless manner in the upcoming inner dimension loop. The variable enloop_(y) is related to an accumulated energy of the vector y. The performing of this method enables the encoder to keep the complexity of the search at a reasonable level. For example, it enables the encoder to apply an increased precision (implying higher complexity) loop only when it may be needed, by analyzing whether the “worst case scenario” in the upcoming inner loop would require an inner loop with a higher precision than the one currently used.

The pre-analysis described above is performed before each entry 102 to the inner loop, i.e. before each addition of a unit pulse to the vector y. In an exemplifying embodiment where only two different bit representations, i.e. bit word lengths such as 16 and 32 bits, are available, the inner loop will be performed using a 16 bit representation of enloop_(y) until it is determined that a longer bit word is needed to represent enloop_(y), after which the higher bit word length, i.e. the 32 bit representation, will be applied for the inner loop calculations. The loop using a 16 bit representation may be referred to as a “low precision loop”, and the loop using a 32 bit representation may be referred to as a “high precision loop”.

The determining 102 of whether more than an initial or current bit word length is needed could alternatively be expressed as determining which bit word length, out of at least two different, alternative bit word lengths, will be required for representing the “worst case” (largest possible increase) enloop_(y) during the next inner loop. The at least two different bit word lengths could comprise e.g. 16 and 32 bit word lengths.

In other words, when more than a current bit word length is determined 102 to be needed to represent enloop_(y) in the next inner loop, the inner loop calculations are performed 103 with a longer bit word length than an initial or current bit word length for representing enloop_(y) in the inner loop. On the other hand, when more than a current bit word length is determined not to be needed to represent enloop_(y), the inner loop calculations may be performed by employing a first unit pulse addition loop using a first or current bit word length to represent enloop_(y), i.e. the current bit word length may continue to be used. This is also illustrated e.g. in FIG. 4, as the use of two different loops. FIG. 4 shows one low precision inner loop, which is run 405 when it is determined 403 that it is sufficient with a current (lower) bit word length; and one high precision inner loop, which is run when it is determined 403 that a higher bit word length is needed to represent the energy in the inner loop, in order not to lose information.

The method builds on the realization that the maximum possible increase of an energy variable, such as enloop_(y), in a next inner loop will occur when the unit pulse is added to the position in y associated with the current maxamp_(y). Having realized this, it is possible to determine, before entering the inner loop, whether there is a risk of exceeding the representation capacity of the currently used bit word length, e.g. 16 bits, during the next inner loop, or not. In other words, the determining of whether more than a current bit word length is needed to represent enloop_(y) comprises determining characteristics of the case when, in the upcoming inner search loop, the unit pulse is added to the position in y being associated with maxamp_(y). For example, the number of bits needed to represent enloop_(y) in the upcoming inner loop may be determined, or, alternatively, a remaining margin in a bit word representing enloop_(y) in the upcoming inner loop may be determined.
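The reasoning behind this worst case can be written out in one line. Adding a unit pulse at position n changes only the term y(n)² of the energy sum, and y(n) is non-negative during the search since signs are assigned afterwards, so:

$\begin{matrix}{{energy}_{y}(k,n) - {energy}_{y}(k-1) = \left( y(n) + 1 \right)^{2} - y(n)^{2} = 2 \cdot y(n) + 1 \leq 2 \cdot {maxamp}_{y} + 1}\end{matrix}$

The right-hand side is the increment that appears inside the logarithm of Eq. 22.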

For target shape vectors being associated with a low K, it is possible to say in advance that there will be no need for a longer bit word length than the one offered by the initial and currently used bit word length. Therefore, it would be possible to apply a threshold value Tk, such that certain operations are performed only for target shape vectors being associated with a K which exceeds the threshold value Tk. For such target vectors, the encoder will keep track of maxamp_(y), by updating this value after each pulse addition. For target vectors associated with a K which is lower than the threshold value, it is not necessary to keep track of maxamp_(y). For the example with 16 and 32 bit words, a possible Tk would be 127, as previously described. In other words, the keeping track of maxamp_(y) and the determining of whether more than a current bit word length is needed is performed, e.g., only when a final value of K associated with the input target shape vector exceeds a threshold value Tk.

An embodiment illustrated in FIG. 2 comprises keeping track of or determining 201 maxamp_(y), and determining 202 xabs_(max). The value of maxamp_(y) may be changed when adding a new unit pulse in the inner loop and therefore maxamp_(y) needs to be updated, in order to be kept up to date after each loop. For example, the action 201 may comprise keeping track of maxamp_(y) until a value of k has reached a threshold value where the initial or current bit word length used for representing enloop_(y) may no longer be sufficient, and the analysis represented e.g. by action 204 is commenced. The updating of maxamp_(y) after an inner loop following the analysis of e.g. action 204 is illustrated as action 206 in FIG. 2. It should be noted, however, that xabs_(max) is not changed in the process, and therefore only needs to be determined 202 once. As illustrated in FIG. 2, an embodiment of the method could also comprise, before entering 205 a next inner dimension loop for unit pulse addition, determining 203, based on a maximum absolute value, xabs_(max), of the input shape vector, x, a possible upshift, in a bit word, of the next loop's accumulated in-loop correlation value, corr_(xy), between x and the vector y. The upshift could also be denoted an upscaling. Equation 19 above illustrates the determining of the maximum possible upshift. By performing this, it may be ensured that as many correlation information bits as possible are maintained during the inner loop evaluation, especially the most significant ones. It should be noted here that the correlation value corr_(xy), in the form of corr_(xy)², need not necessarily be represented in a lossless manner. The determining of the maximum upshift may be performed in a “longer” bit word, irrespective of the current bit word length used in the inner loop. That is, the maximum possible upshift may be determined for a 32 bit word even when a 16 bit word will be used in the inner loop.
When a shorter bit word is to be used in the inner loop, the determined upshift will then be “rounded” to the shorter bit word, as illustrated by Eq. 20. Note that the correlation value, corr_(xy), always is less than or equal to one (1.0) in the applied DSP precision for the correlation value, i.e. corr_(xy)≤1.0, and therefore, the maximum upshift determined for corr_(xy) is also valid for corr_(xy)².

When more than a current bit word length is determined to be needed to represent enloop_(y), the inner loop calculations may be performed using a longer bit word length (than the current bit word length, e.g. 32 instead of 16 bits) to represent enloop_(y).

In one embodiment, when more than a current bit word length is determined to be needed to represent enloop_(y), the inner loop calculations are performed with a longer bit word length (than the current bit word length), representing also an accumulated in-loop correlation value, corr_(xy)², in the inner loop. This is illustrated e.g. in FIG. 3, in action 305. That is, a bit word length determined for the energy value enloop_(y) may also be applied for corr_(xy)².

As previously mentioned, it is preferred to avoid performing the division of Eq. 8 in the inner dimension search loop for unit pulse addition. Therefore, a cross-multiplication may be performed, as illustrated in Eq. 10. That is, a position, n_(best), in y for addition of a unit pulse could be determined by evaluating a cross-multiplication, for each position n in y, of a correlation and energy value for the current n; and a “best so far” correlation, BestCorrSq, and a “best so far” energy value bestEn, saved from previous values of n, as:

corr_(xy) ²*bestEn>BestCorrSq*enloop_(y)

where

$\begin{matrix}{\left. \begin{matrix}{n_{best} = n} \\{{bestEn} = {enloop}_{y}} \\{{BestCorrSq} = {corr}_{xy}^{2}}\end{matrix} \right\},{{{when}\mspace{14mu} {{corr}_{{xy}\;}^{2} \cdot {bestEn}}} > {{BestCorrSq}*{enloop}_{y}}}} & \;\end{matrix}$

The position n_(best) could be referred to as a “best” position in y for addition of a unit pulse. It should be noted that “≥” could be used in the expressions above instead of “>”. However, “>”, i.e. “larger than”, may be preferred when trying to keep the computational cost as low as possible, e.g. in regard of number of cycles. The performing of the method according to any of the embodiments described above enables this cross-multiplication to be performed in an efficient manner (e.g. by not using a higher precision than actually needed).
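The division-free comparison can be isolated in a small C predicate. The sketch below uses 64 bit products so that it is independent of the chosen word lengths, which is a simplification compared with the 16 and 32 bit loops described above. The function name is hypothetical and both energy terms are assumed to be positive.

```c
#include <stdint.h>

/* Returns 1 when corr_sq / en > best_corr_sq / best_en, evaluated by
   cross-multiplication so that no division is needed. With the strict ">"
   a tie keeps the earlier position. */
static int candidate_is_better(int32_t corr_sq, int32_t en,
                               int32_t best_corr_sq, int32_t best_en)
{
    return (int64_t)corr_sq * best_en > (int64_t)best_corr_sq * en;
}
```

For instance, a candidate with corr_sq=9 and en=2 (ratio 4.5) beats a saved best of corr_sq=4 and en=1 (ratio 4.0), whereas a candidate with corr_sq=8 and en=2 ties and is rejected.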

Implementations

The methods and techniques described above may be implemented in an encoder or codec, which may be comprised e.g. in a communication device.

Encoder, FIGS. 11a-11c

An exemplifying embodiment of an encoder is illustrated in a general manner in FIG. 11a. The encoder may be a media encoder, configured for encoding of e.g. audio and/or video signals. The encoder 1100 is configured to perform at least one of the method embodiments described above with reference to any of FIGS. 1-5. The encoder 1100 is associated with the same technical features, objects and advantages as the previously described method embodiments. In some implementations, the encoder is associated with constraints in regard to memory and/or complexity, such as e.g. when the encoder is configured with a fixed precision DSP. The encoder will be described in brief in order to avoid unnecessary repetition.

The encoder may be implemented and/or described as follows: The encoder 1100 is configured for Pyramid Vector Quantization, including so-called fine search or fine shape search, where a Pyramid Vector Quantizer, PVQ, is configured to take a target vector x as input and derive a vector y by iteratively adding unit pulses in an inner dimension search loop. The input vector x has a dimension N and an L1-norm K. The encoder 1100 comprises processing circuitry, or processing means 1101, and a communication interface 1102. The processing circuitry 1101 is configured to cause the encoder 1100 to, before entering a next inner dimension search loop for unit pulse addition: determine, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), related to an accumulated energy of y, in the upcoming inner dimension loop. The communication interface 1102, which may also be denoted e.g. Input/Output (I/O) interface, includes an interface for sending data to and receiving data from other entities or modules.

The processing circuitry 1101 could, as illustrated in FIG. 11b, comprise processing means, such as a processor 1103, e.g. a CPU, and a memory 1104 for storing or holding instructions. The memory would then comprise instructions, e.g. in the form of a computer program 1105, which when executed by the processing means 1103 causes the encoder 1100 to perform the actions described above.

An alternative implementation of the processing circuitry 1101 is shown in FIG. 11c. The processing circuitry here comprises a determining unit 1106, configured to cause the encoder 1100 to, before entering a next inner dimension search loop for unit pulse addition: determine, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether a higher precision than allowed with a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), related to an accumulated energy of y, in the upcoming inner dimension loop. The processing circuitry 1101 could comprise more units, such as a fine search unit 1107, configured to cause the encoder to run an inner dimension loop with a certain bit word length and/or a certain precision.

The encoders described above could be configured for the different method embodiments described herein, such as e.g. to perform the inner loop calculations using a longer bit word representing enloop_(y) and possibly corr_(xy)², when more than a current bit word length is determined to be needed to represent enloop_(y). “Longer” here refers to longer than a current or initial bit word length.

The encoder 1100 may be assumed to comprise further functionality, for carrying out regular encoder functions.

The encoder described above may be comprised in a device, such as a communication device. The communication device may be a user equipment (UE) in the form of a mobile phone, video camera, sound recorder, tablet, desktop, laptop, TV set-top box or home server/home gateway/home access point/home router. The communication device may in some embodiments be a communications network device adapted for coding and/or transcoding. Examples of such communications network devices are servers, such as media servers, application servers, routers, gateways and radio base stations. The communication device may also be adapted to be positioned in, i.e. being embedded in, a vessel, such as a ship, flying drone or airplane, or a road vehicle, such as a car, bus or lorry. Such an embedded device would typically belong to a vehicle telematics unit or vehicle infotainment system.

The steps, functions, procedures, modules, units and/or blocks described herein may be implemented in hardware using any conventional technology, such as discrete circuit or integrated circuit technology, including both general-purpose electronic circuitry and application-specific circuitry.

Particular examples include one or more suitably configured digital signal processors and other known electronic circuits, e.g. discrete logic gates interconnected to perform a specialized function, or Application Specific Integrated Circuits (ASICs).

Alternatively, at least some of the steps, functions, procedures, modules, units and/or blocks described above may be implemented in software such as a computer program for execution by suitable processing circuitry including one or more processing units. The software could be carried by a carrier, such as an electronic signal, an optical signal, a radio signal, or a computer readable storage medium before and/or during the use of the computer program in the communication device.

The flow diagram or diagrams presented herein may be regarded as a computer flow diagram or diagrams, when performed by one or more processors. A corresponding apparatus may be defined by a group of function modules, where each step performed by the processor corresponds to a function module. In this case, the function modules are implemented as a computer program running on the processor. It is to be understood that the function modules do not have to correspond to actual software modules.

Examples of processing circuitry include, but are not limited to, one or more microprocessors, one or more Digital Signal Processors, DSPs, one or more Central Processing Units, CPUs, and/or any suitable programmable logic circuitry such as one or more Field Programmable Gate Arrays, FPGAs, or one or more Programmable Logic Controllers, PLCs. That is, the units or modules in the arrangements in the different devices described above could be implemented by a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g. stored in a memory. One or more of these processors, as well as the other digital hardware, may be included in a single application-specific integrated circuit, ASIC, or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a system-on-a-chip, SoC.

It should also be understood that it may be possible to re-use the general processing capabilities of any conventional device or unit in which the proposed technology is implemented. It may also be possible to re-use existing software, e.g. by reprogramming of the existing software or by adding new software components.

Further Exemplifying Embodiments

Expressed in a slightly different manner, the disclosure herein relates to, for example, the following aspects and embodiments.

One of the aspects is an encoder/codec, wherein the encoder/codec is configured to perform one, more than one or even all of the following steps, illustrated e.g. in FIGS. 5-6:

-   determining, calculating or obtaining a maximum absolute value (xabs_(max)) of an input target vector (x(n)), e.g. according to equations 11 and 12 above and as illustrated e.g. with step S1 in FIG. 5 in one embodiment,
-   determining, calculating or obtaining a possible upshift of a correlation value based at least on the maximum absolute value (xabs_(max)), e.g. by calculating the possible upshift of a next loop's accumulated in-loop correlation value in a signed 32-bit word through the equation 19 above and illustrated with step S2 in FIG. 5 in one embodiment,
-   if the number of final unit pulses (K) will end up higher than a threshold (t_(p)), which for example may be 127 unit pulses, determining, e.g. keeping track of/storing, a maximum pulse amplitude (maxamp_(y)) value/information calculated e.g. according to equation 26 above of a vector (y(n)), which may be defined according to equations 13 and 14 above, and
    -   determining/calculating/deciding/selecting based on the stored maximum pulse amplitude, e.g. through a calculation in accordance with equations 22 and 23 above and as illustrated by step S3 in FIG. 6, if more than a certain word length is needed or should be used, e.g. more than a signed 16 bit word or more than a signed 32 bit word, to represent in-loop energy without losing, or substantially losing, any energy information,
    -   representing a best squared correlation term/parameter/value and a best accumulated energy term/parameter/value by more than the certain word length, e.g. 32 bit words or 64 bit words, if more than the certain word length is needed, and
    -   if less than the certain word length is needed, running a first loop,
    -   if more than the certain word length is needed, running a second, alternative loop with the “best so far” (near optimal) accumulated energy term and best squared correlation term represented by the more than the certain word length words.

The second loop may be a higher precision, computationally more intensive unit pulse loop than the first loop, which has lower precision in relation to the second loop. The in-loop accumulated energy based selection of the inner loop precision has the effect that target sub vectors that have a high peakiness, or have very fine granularity (final K is high), will or could use the higher precision loop and more cycles more often, while non-peaky or low pulse granularity sub vectors will or could more often use the lower precision loop and fewer cycles.
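The loop selection described above can be sketched in C as follows. This is a minimal illustration and not the Appendix 1 code: the helper names `headroom_bits32` and `needs_high_precision` are hypothetical, and the worst-case energy expression is an assumption based on the fact that raising a pulse of amplitude a to a+1 adds 2a+1 to the energy. The margin is computed as the number of spare bits in a signed 32-bit word, and fewer than 16 spare bits is taken to mean that a signed 16-bit word no longer suffices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of bits a positive value can be shifted left while still fitting
 * in a signed 32-bit word, i.e. floor(log2((2^31 - 1) / value)). */
static int headroom_bits32(int64_t value)
{
    assert(value > 0 && value <= INT32_MAX);
    int margin = 0;
    while ((value << (margin + 1)) <= (int64_t)INT32_MAX) {
        margin++;
    }
    return margin;
}

/* Hypothetical helper: decide, before the next unit pulse addition, whether
 * the high precision loop is needed. Assumption: the worst-case energy after
 * adding one pulse on top of the largest existing pulse is
 * energy_y + 2 * maxamp_y + 1, since (a + 1)^2 = a^2 + 2a + 1. */
static bool needs_high_precision(int32_t energy_y, int32_t maxamp_y)
{
    assert(energy_y >= 0 && maxamp_y >= 0);
    int64_t worst_case = (int64_t)energy_y + 2 * (int64_t)maxamp_y + 1;
    int en_margin = headroom_bits32(worst_case);
    /* Fewer than 16 spare bits: the value does not fit a signed 16-bit word. */
    return en_margin < 16;
}
```

With these assumptions the decision flips exactly when the worst-case energy exceeds 32767, the largest signed 16-bit value, so peaky vectors (large maxamp_(y)) reach the high precision loop earlier than flat ones.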

One aspect relates to a communication device 1, illustrated in FIG. 9, which comprises an encoder or codec 2 for video or audio coding, e.g. an EVS encoder.

The encoder or codec may be fully or partially implemented as a DSP positioned in the communication device. In a first embodiment the encoder/codec is configured to make a PVQ-shape search based on a target sub vector (x(n)), the number of final unit pulses (K), a sub vector dimension value (N) of the target sub vector and optionally also one or more gain values (g_(sub)). The encoder or codec may also be configured to make a PVQ band split, and in such a case the PVQ-shape search would also be based on a number/value of sub vectors of a band (N_(S)) and a largest gain of a gain vector G, (g_(max)=max(G)=max(g_(0) . . . g_((Ns−1)))). The encoder or codec is further configured to output from the PVQ-shape search an integer vector (y) and/or a shape sub vector x_(q)(n) to be used by the encoder for PVQ indexing. The integer vector (y) comprises element values and has the same length as the sub vector dimension value (N), and an absolute sum of all the element values is equal to the number of unit pulses (K).
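The constraint on the output vector stated above, namely N integer elements whose absolute values sum to K, can be checked with a few lines of C. The function name `pvq_is_valid` is a hypothetical helper introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns true when the N-dimensional integer vector y lies on the PVQ
 * pyramid with K unit pulses, i.e. sum(|y[n]|) == K. */
static bool pvq_is_valid(const int *y, int N, int K)
{
    assert(y != NULL && N > 0 && K >= 0);
    long abs_sum = 0;
    for (int n = 0; n < N; n++) {
        abs_sum += labs((long)y[n]);
    }
    return abs_sum == (long)K;
}
```

Such a check is useful as a debug assertion on the search output before the vector is handed to the PVQ indexing stage.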

The encoder/codec/communication device is configured to perform the PVQ-shape search, wherein the encoder/codec/communication device is configured to:

-   determine, calculate or obtain (S1, S23) a maximum absolute value (xabs_(max)) of the input (target) vector (x(n)), e.g. according to equations 11 and 12 above,
-   determine, calculate or obtain (S2, S28) a possible upshift of a correlation value based at least on the maximum absolute value (xabs_(max)), e.g. by calculating the possible upshift of a next loop's accumulated in-loop correlation value in a signed 32-bit word through the equation 19 above,
-   if the number of final unit pulses (K) will end up higher than a threshold (t_(p)), which for example may be 127 unit pulses, keep track of/store (S30) a maximum pulse amplitude (maxamp_(y)) value/information calculated e.g. according to equation 26 above of a vector (y(n)), which may be defined according to equations 13 and 14 above, and
    -   determine/calculate/decide/select (S3, S32) based on the stored maximum pulse amplitude, e.g. through a calculation in accordance with equations 22 and 23 above, if more than a certain word length is needed or should be used, e.g. more than a signed 16 bit word or more than a signed 32 bit word, to represent in-loop energy,
    -   represent (S34) a best squared correlation term/parameter/value and a best accumulated energy term/parameter/value by more than the certain word length, e.g. 32 bit words or 64 bit words, if more than the certain word length is needed, and
    -   if less than the certain word length is determined, run (S33) a first loop,
    -   if more than the certain word length is determined, run (S35) a second, alternative loop with the best accumulated energy term and best squared correlation term represented by the more than the certain word length words.

The above PVQ-shape search, which may be a limited precision PVQ-shape search, is in one embodiment performed by a vector quantizer, which is a part of the encoder/codec and may be implemented at least partly, but also fully, as a DSP unit, which may be positioned in or adapted to be positioned in a communication device. Thus the encoder/codec may be fully or partly implemented as a hardware unit, e.g. a DSP or a field-programmable gate array (FPGA). It may however in alternative embodiments be implemented with the help of a general purpose processor and a codec computer program which when run on the general purpose processor causes the communication device to perform one or more of the steps mentioned in the paragraph above. The processor may also be a Reduced Instruction Set Computing (RISC) processor.

Another aspect of the disclosure herein is, as indicated in the paragraph above, a computer program 6, illustrated in FIG. 10, such as an encoder computer program or codec computer program, one embodiment of which is fully disclosed by the ANSI-C code example in Appendix 1 below. The computer program comprises computer readable code, which when run on a processor/processor unit 4 of a communication device 1 causes the communication device to perform one or more of the steps mentioned in conjunction with the method in the paragraph below or any of the steps mentioned in conjunction with FIG. 7.

Yet another aspect is a PVQ-shape search method performed by a communication device/codec/encoder, wherein the method comprises one or more of the following steps:

-   determining, calculating or obtaining (S1) a maximum absolute value (xabs_(max)) of the input (target) vector (x(n)), e.g. according to equations 11 and 12 above,
-   determining, calculating or obtaining (S2, S28) a possible upshift of a correlation value based at least on the maximum absolute value (xabs_(max)), e.g. by calculating the possible upshift of a next loop's accumulated in-loop correlation value in a signed 32-bit word through the equation 19 above,
-   if the number of final unit pulses (K) will end up higher than a threshold (t_(p)), which for example may be 127 unit pulses, keeping track of/storing a maximum pulse amplitude (maxamp_(y)) value/information calculated e.g. according to equation 26 above of a vector (y(n)), which may be defined according to equations 13 and 14 above, and
    -   determining/calculating/deciding/selecting (S3) based on the stored maximum pulse amplitude, e.g. through a calculation in accordance with equations 22 and 23 above, if more than a certain word length is needed or should be used, e.g. more than a signed 16 bit word or more than a signed 32 bit word, to represent in-loop energy,
    -   representing a best squared correlation term/parameter/value and a best accumulated energy term/parameter/value by more than the certain word length, e.g. 32 bit words or 64 bit words, if more than the certain word length is needed, and
    -   if less than the certain word length is determined, running a first loop,
    -   if more than the certain word length is determined, running a second, alternative loop with the best accumulated energy term and best squared correlation term represented by the more than the certain word length words.

The communication device may be a user equipment (UE) in the form of a mobile phone, video camera, sound recorder, tablet, desktop, laptop, TV set-top box or home server/home gateway/home access point/home router, etc. as defined above.

Still another aspect is a computer readable storage medium 5 (see FIG. 10) on which any of the above embodiments of the computer program is stored. The computer readable storage medium may be in the form of a volatile or non-volatile memory, e.g. an EEPROM (Electrically Erasable PROM), FPGA, a flash memory (including Solid-state drive), or a hard drive.

An embodiment of the communication device 1 is illustrated in FIG. 9. The communication device comprises, for the performance of a PVQ-shape search, one, more than one or all of the following units:

-   a first determining unit, U1, for determining, calculating or obtaining a maximum absolute value (xabs_(max)) of the input (target) vector (x(n)), e.g. according to equations 11 and 12 above,
-   a second determining unit, U2, for determining, calculating or obtaining a possible upshift of a correlation value based at least on the maximum absolute value (xabs_(max)), e.g. by calculating the possible upshift of a next loop's accumulated in-loop correlation value in a signed 32-bit word through the equation 19 above,
-   a storing unit, U3, for keeping track of/storing a maximum pulse amplitude (maxamp_(y)) value/information calculated e.g. according to equation 26 above of a vector (y(n)), which may be defined according to equations 13 and 14 above, if the number of final unit pulses (K) will end up higher than a threshold (t_(p)),
-   a selection unit, U4, for determining/calculating/deciding/selecting based on the stored maximum pulse amplitude, e.g. through a calculation in accordance with equations 22 and 23 above, if more than a certain word length is needed or should be used, e.g. more than a signed 16 bit word or more than a signed 32 bit word, to represent in-loop energy,
-   a representation unit, U5, for generating a best squared correlation term/parameter/value and a best accumulated energy term/parameter/value with a word length, e.g. 32 bit words or 64 bit words, being more than the certain word length if more than the certain word length is selected by the selection unit, and
-   an inner loop processing unit, U6, for
    -   running a first loop, if less than the certain word length is selected by the selection unit, and
    -   running a second, alternative loop with the best accumulated energy term and best squared correlation term represented by the more than the certain word length words, if more than the certain word length is determined.

The units mentioned in the paragraph above may be comprised in a codec/encoder 2 in the form of a DSP in the communication unit and may furthermore be comprised in a hardware vector quantizer of the DSP. In an alternative embodiment, all the units in the paragraph above are implemented in the communication device as software.

As further illustrated in FIG. 9, the communication device 1 may also comprise further units related to the encoder/codec and in particular units related to vector quantization and PVQ-shape searching. Such units are configured to enable shape searches according to the description and figures comprised in this application. Exemplary units illustrated in FIG. 9 are:

-   a PVQ band split unit U7 for performing the optional step S21 described in conjunction with FIG. 7,
-   a comparison unit U8 for performing the step S24 described in conjunction with FIG. 7 below,
-   a PVQ vector generating unit U9 for performing the step S25 described below,
-   a starting point generating unit U10 for performing the step S26 described below,
-   a parameter calculating unit U11 for performing the step S27 described below,
-   a bit allocation unit U12 for e.g. supplying K and N to the shape search,
-   a PVQ indexing unit U13, which can be seen as a receiver of the output from the PVQ-shape search disclosed herein,
-   a normalization unit U14 for performing step S36 described below, and
-   an output unit U15 for performing step S37 described below.

In the case of a software implementation in a communication device, an embodiment of the communication device 1 may be defined as a communication device comprising a processor 4 and a computer program storage product 5 in the form of a memory, said memory containing instructions executable by said processor, whereby said communication device is operative to perform one, more than one or all of the following:

-   determine, calculate or obtain a maximum absolute value (xabs_(max)) of an input (target) vector (x(n)), e.g. according to equations 11 and 12 above,
-   determine, calculate or obtain a possible upshift of a correlation value based at least on the maximum absolute value (xabs_(max)), e.g. by calculating the possible upshift of a next loop's accumulated in-loop correlation value in a signed 32-bit word through the equation 19 above,
-   if the number of final unit pulses (K) will end up higher than a threshold (t_(p)), which for example may be 127 unit pulses, keep track of/store a maximum pulse amplitude (maxamp_(y)) value/information calculated e.g. according to equation 26 above of a vector (y(n)), which may be defined according to equations 13 and 14 above, and
    -   determine/calculate/decide/select based on the stored maximum pulse amplitude, e.g. through a calculation in accordance with equations 22 and 23 above, if more than a certain word length is needed or should be used, e.g. more than a signed 16 bit word or more than a signed 32 bit word, to represent in-loop energy,
    -   represent a best squared correlation term/parameter/value and a best accumulated energy term/parameter/value by more than the certain word length, e.g. 32 bit words or 64 bit words, if more than the certain word length is needed, and
    -   if less than the certain word length is determined, run a first loop,
    -   if more than the certain word length is determined, run a second, additional loop with the best accumulated energy term and best squared correlation term represented by the more than the certain word length words.

To further illustrate aspects and embodiments, some of them are described in the following in conjunction with FIGS. 7-8.

FIG. 8 provides an overview of a transmitting side of the emerging 3GPP EVS, including an EVS encoder 3, which here is comprised in the communication device 1.

FIG. 7 illustrates some method steps in an alternative way of describing some embodiments in relation to the embodiments illustrated in FIGS. 5-6. Even though some of the steps mentioned with respect to FIG. 7 can be said to be made in conjunction with a PVQ-shape search, it should also be apparent that some of the steps could also be said to be performed before the PVQ-shape search. In an optional first step S21, a PVQ band split is performed.

Shape target sub vectors, optionally from step S21, are received in a second step S22, wherein, depending on the embodiment, also g_(sub), g_(max) and N_(s) may be received.

In a third step S23, which corresponds to step S1 in FIG. 5, a maximum absolute value of a target vector is determined, e.g. by first calculating the absolute value of the sub vector x(n) of the target vector and then selecting the largest absolute value of the sub vector.
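Step S23 can be sketched as a single pass over the sub vector. The function name `max_abs_value` and the use of `float` samples are assumptions made for this illustration; a fixed-point implementation would use integer words instead.

```c
#include <assert.h>
#include <stddef.h>

/* Step S23 sketch: store |x[n]| in x_abs[n] and return the largest of them. */
static float max_abs_value(const float *x, float *x_abs, int N)
{
    assert(x != NULL && x_abs != NULL && N > 0);
    float xabs_max = 0.0f;
    for (int n = 0; n < N; n++) {
        x_abs[n] = (x[n] < 0.0f) ? -x[n] : x[n];
        if (x_abs[n] > xabs_max) {
            xabs_max = x_abs[n];
        }
    }
    return xabs_max;
}
```

Keeping the absolute values in a separate array lets the later search work on non-negative samples only, with the signs restored at the end.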

In an optional fourth step S24, it is determined whether the value of the target vector is equal to or below a first threshold. The threshold is set to “filter out” target vectors which are considered to have very low energy values. As explained above, the threshold could be set to be equal to zero in one embodiment. It could in this fourth step also be decided if a sub vector gain is equal to or below a second threshold. In one embodiment the second threshold is set to zero, but may in other embodiments be set to be the Machine Epsilon in dependence of the precision used for processed words.

If it is determined in the fourth step S24 that the target vector is equal to or below the first threshold and/or the sub vector gain is below or equal to the second threshold, then a PVQ-vector is created in an optional fifth step S25. In one embodiment the vector is created deterministically by assigning half of the K unit pulses to a first position, $y[0] = \left\lfloor \frac{K}{2} \right\rfloor$, and the remaining unit pulses to a last position, $y[N-1] = y[N-1] + (K - y[0])$. This step could in conjunction with the fourth step S24 be seen as bypassing the whole actual PVQ-shape search, but can also be seen as a sub-routine within the context of a general PVQ-shape search procedure.
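The deterministic vector of step S25 can be written directly from the two assignments above. The function name `pvq_bypass_vector` is hypothetical, and the sketch assumes that y is cleared before the pulses are placed.

```c
#include <assert.h>
#include <stddef.h>

/* Step S25 sketch: place floor(K/2) pulses at the first position and the
 * remaining pulses at the last position of an otherwise all-zero vector. */
static void pvq_bypass_vector(int *y, int N, int K)
{
    assert(y != NULL && N > 0 && K >= 0);
    for (int n = 0; n < N; n++) {
        y[n] = 0;
    }
    y[0] = K / 2;         /* floor(K/2) for non-negative K */
    y[N - 1] += K - y[0]; /* accumulating also handles N == 1 correctly */
}
```

Writing the second assignment as an accumulation means that for a one-dimensional sub vector, where the first and last positions coincide, all K pulses still end up in the vector.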

In an optional sixth step S26, an initial value (starting point) for y, y_start, is set for the PVQ-shape search to follow, wherein the initial value is dependent on the ratio between K and N. If the ratio is larger than a third threshold value, which may be 0.5 unit pulses per coefficient, a first projection to a K−1 sub pyramid is used as the initial vector y_start in a following step. The first projection may be calculated as in equations 13 and 14 above. If lower than the third threshold, then the initial vector y_start is decided to start off from 0 pre-placed unit pulses.
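A minimal sketch of the starting point decision in step S26 follows. Equations 13 and 14 are not reproduced in this part of the text, so the projection used here, rounding x_abs(n)·(K−1)/Σx_abs down to an integer, is an assumption about their form, and the function name `pvq_start_vector` is hypothetical. Floating point is used for brevity; a limited precision implementation would need extra care with rounding.

```c
#include <assert.h>
#include <stddef.h>

/* Step S26 sketch: choose the starting vector y_start and return the number
 * of pre-placed unit pulses. Assumed projection onto the K-1 sub pyramid:
 * y_start[n] = floor(x_abs[n] * (K - 1) / sum(x_abs)). */
static int pvq_start_vector(const float *x_abs, int *y_start, int N, int K)
{
    assert(x_abs != NULL && y_start != NULL && N > 0 && K >= 1);
    float abs_sum = 0.0f;
    for (int n = 0; n < N; n++) {
        y_start[n] = 0;
        abs_sum += x_abs[n];
    }
    /* K/N > 0.5 is tested in integers as 2*K > N; otherwise start from zero. */
    if (2 * K <= N || abs_sum <= 0.0f) {
        return 0;
    }
    float scale = (float)(K - 1) / abs_sum;
    int pulse_tot = 0;
    for (int n = 0; n < N; n++) {
        y_start[n] = (int)(x_abs[n] * scale); /* truncation == floor, values >= 0 */
        pulse_tot += y_start[n];
    }
    return pulse_tot;
}
```

Because every element is rounded down, the projection places at most K−1 pulses, which leaves at least one pulse for the fine search that follows.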

In preparation for subsequent PVQ-shape search steps, all the initial vector values in y_start are set to zero in a seventh step S27. In this step a first parameter, here called the accumulated number of unit pulses, pulse_(tot), a second parameter, here called the accumulated correlation, corr_(xy)(pulse_(tot)), and a third parameter, here called the accumulated energy, energy_(y)(pulse_(tot)), for the starting point are computed, e.g. according to equations 15-17 respectively. A fourth parameter, here called enloop_(y)(pulse_(tot)), may also be calculated in this step according to equation 18 above.

In an eighth step S28, a PVQ-shape search is started, or in an alternative way of looking at it, the second, fine search part of the PVQ-shape search is started for remaining unit pulses up to K with the help of previously obtained, determined or calculated K, N, X_abs, max_xabs, and y, and in some embodiments also g_(sub), g_(max) and N_(S). Detailed steps of some embodiments of this fine search are thoroughly illustrated by e.g. FIG. 6, but it could be emphasized that in some embodiments the fine search comprises a determination of a fifth parameter/value, here called an upshift of a correlation value, corr_(upshift), which is calculated for at least some, and in some embodiments all, of the unit pulses for which the fine search or inner loop is done. In some embodiments a possible upshift of a next loop's accumulated in-loop correlation value in a signed 32-bit word is calculated based on equation 19 above, and corr_(upshift) is then used as input to a calculation of a correlation value corr_(xy) in equation 20.
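One iteration of the fine search can be illustrated with the following sketch, which adds a single unit pulse at the position that maximizes the squared correlation divided by the energy. It is not the limited precision loop of FIG. 6: the correlation upshift is left out and 64-bit integers are used throughout, on the assumption that the inputs are small enough (for example 16-bit samples) for the cross-multiplied products not to overflow. The function name `pvq_add_one_pulse` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds one unit pulse to y at the position n that maximizes
 * (corr_xy + x_abs[n])^2 / (energy_y + 2*y[n] + 1), then updates the
 * accumulated correlation and energy. Returns the chosen position. */
static int pvq_add_one_pulse(const int32_t *x_abs, int32_t *y, int N,
                             int64_t *corr_xy, int64_t *energy_y)
{
    assert(x_abs != NULL && y != NULL && N > 0);
    int n_best = 0;
    int64_t best_corr_sq = -1; /* guarantees that the first candidate wins */
    int64_t best_energy = 1;
    for (int n = 0; n < N; n++) {
        int64_t corr = *corr_xy + x_abs[n];
        int64_t energy = *energy_y + 2 * (int64_t)y[n] + 1;
        int64_t corr_sq = corr * corr;
        /* Compare corr_sq/energy with best_corr_sq/best_energy by cross
         * multiplication, which avoids a division per candidate. */
        if (corr_sq * best_energy > best_corr_sq * energy) {
            n_best = n;
            best_corr_sq = corr_sq;
            best_energy = energy;
        }
    }
    *corr_xy += x_abs[n_best];
    *energy_y += 2 * (int64_t)y[n_best] + 1;
    y[n_best] += 1;
    return n_best;
}
```

The word lengths of `best_corr_sq` and `best_energy` are exactly the quantities whose precision the adaptive logic of this disclosure selects; the sketch simply fixes them at the widest setting.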

In a ninth step S29, which may be said to be a part of the fine PVQ-shape search, it is determined whether the number of final unit pulses K will end up higher than a threshold, t_(p), for the number of final unit pulses. If this is the case, then in a tenth step S30, the maximum pulse amplitude maxamp_(y) is stored.

In an eleventh step S31, a sixth parameter, en_(margin), is calculated according to e.g. equation 22.

In a twelfth step S32, the sixth parameter is compared with a fourth threshold value, which corresponds to a certain word length.

If the answer is “Yes” (in S32 of FIG. 7) or False (in the ANSI-C code example in Appendix 1), i.e. en_(margin) in the exemplary equation/decision 23 is equal to or larger than the fourth threshold “16”, then in a thirteenth step S33, a first loop is run, which is faster and “coarser” than a second loop of the fine search. Embodiments of the first loop are shown in e.g. FIG. 6.

If the answer is “No” (in S32 of FIG. 7) or True (in the ANSI-C code example in Appendix 1), i.e. en_(margin) in the exemplary equation/decision 23 is less than “16”, then in a fourteenth step S34, a seventh parameter, the best squared correlation term/parameter/value, and an eighth parameter, the best accumulated energy term/parameter/value, are generated/transformed to become more than the certain word length, and are then used in a fifteenth step S35 in the second, more detailed inner loop of the fine search. Embodiments of the second loop are shown in more detail in e.g. FIG. 6.

In a sixteenth step S36, at least each non-zero PVQ-sub vector element is assigned its proper sign and the vector is L2-normalized to unit energy. If, in some embodiments, a band has been split, then it is scaled with a sub-vector gain g_(sub). A normalized x_(q) may also be determined based on equation 28. An exemplary procedure for this step is more thoroughly described above.

In a seventeenth step S37, the normalized x_(q) and y are output from the PVQ-shape search process and forwarded to a PVQ-indexing process included in e.g. the codec.

Some Advantages of Embodiments and Aspects

Below are some advantages over prior art enabled by at least some of the aspects and embodiments disclosed above.

The proposed correlation scaling method/algorithm using a pre-analysis of the current accumulated maximum correlation improves the worst case (minimum) SNR performance of a limited precision PVQ-shape quantization search implementation. The adaptive criterion for up-front correlation margin analysis requires very marginal additional complexity. Further, no costly pre-normalization of the target vector x to e.g. unit energy is required.

The adaptive criterion using tracking of the maximum pulse amplitude in the preliminary result, followed by a pre-analysis of the worst case accumulated energy, for e.g. the soft 16/32 bit precision inner-loop decision, requires very little additional computational complexity and provides a good trade-off where the complexity may be kept low while high precision correlation and high precision energy metrics are still used for relevant input signals, and further, subjectively important peaky signals will be assigned more precision. In other words, at least some of the embodiments and aspects improve the functioning of a computer/processor itself.

In the result tables in Appendix 2 below, one can find that an example PVQ-based system using the adaptive precision logic costs 6.843 WMOPS, whereas using 32 bit energy and squared correlation precision in all (any K) inner search loops raises the cost to 10.474 WMOPS.
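As a worked check of the two figures quoted above (plain arithmetic on the stated values, not an additional measurement), the difference and the relative saving are:

$$10.474 - 6.843 = 3.631\ \text{WMOPS}, \qquad \frac{3.631}{10.474} \approx 0.347$$

That is, the adaptive precision logic lowers the worst case cost by roughly 35% relative to always running the 32 bit inner loops.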

Concluding Remarks

The embodiments described above are merely given as examples, and it should be understood that the proposed technology is not limited thereto. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the present scope. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.

When using the word “comprise” or “comprising” it shall be interpreted as non-limiting, i.e. meaning “consist at least of”.

It should also be noted that in some alternate implementations, the functions/acts noted in the blocks may occur out of the order noted in the flowcharts. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Moreover, the functionality of a given block of the flowcharts and/or block diagrams may be separated into multiple blocks and/or the functionality of two or more blocks of the flowcharts and/or block diagrams may be at least partially integrated. Finally, other blocks may be added/inserted between the blocks that are illustrated, and/or blocks/operations may be omitted without departing from the scope of inventive concepts.

It is to be understood that the choice of interacting units, as well as the naming of the units within this disclosure, are only for exemplifying purposes, and nodes suitable to execute any of the methods described above may be configured in a plurality of alternative ways in order to be able to execute the suggested procedure actions.

It should also be noted that the units described in this disclosure are to be regarded as logical entities and not with necessity as separate physical entities.

Reference to an element in the singular is not intended to mean “one and only one” unless explicitly so stated, but rather “one or more.” All structural and functional equivalents to the elements of the above-described embodiments that are known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed hereby. Moreover, it is not necessary for a device or method to address each and every problem sought to be solved by the technology disclosed herein, for it to be encompassed hereby.

In some instances herein, detailed descriptions of well-known devices, circuits, and methods are omitted so as not to obscure the description of the disclosed technology with unnecessary detail. All statements herein reciting principles, aspects, and embodiments of the disclosed technology, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, e.g. any elements developed that perform the same function, regardless of structure.

ABBREVIATIONS

-   N vector dimension
-   N_(s) sub-vector dimension
-   x target vector
-   x_(q) Quantized shape vector
-   y_(final) integer vector adhering to the L1-norm K
-   K Number of final unit pulses
-   k number of accumulated unit pulses index
-   n coefficient or sample index
-   i sub vector index
-   MDCT Modified Discrete Cosine Transform
-   PVQ Pyramid Vector Quantizer (Quantization)
-   WC Worst Case
-   WMOPS Weighted Million Operations Per Second
-   AccEn Accumulated Energy
-   ROM Read Only Memory
-   PROM Program ROM
-   SNR Signal-to-Noise Ratio
-   EVS Enhanced Voice Service
-   3GPP 3^(rd) Generation Partnership Project
-   DSP Digital Signal Processor
-   CELT Constrained Energy Lapped Transform
-   IETF Internet Engineering Task Force
-   MAC Multiply-Accumulate
-   ACELP Algebraic code-excited linear prediction
-   EPS Machine epsilon

Appendix 1: Exemplary Implementation of Embodiment in ANSI-C Code

Below is an example of an implementation of an exemplifying embodiment in ANSI C-code using STL 2009 G.191 virtual 16/32 bit (a simulation of a DSP).

The above code should be easy to read for all persons skilled in the art and should not have to be explained in more detail. However, for the non-skilled person it is mentioned that the relational operator “==” is an operator which in an example of “A==B” returns a logical value set to logical 1 (true) when values A and B are equal, and otherwise returns logical 0 (false). L_mac is a multiply-accumulate within the meaning that L_mac (L_v3, v1, v2)=L_v3+v1*v2.

APPENDIX 2: Tabled Simulation Results

Simulation Background

Embodiments of the disclosure herein have been simulated. For all PVQ-shape-search simulations made, the bit rate used was 64000 bps, and the codec was operated in MDCT mode, with initial MDCT coefficient sub-band sizes of [8, 12, 16, 24, 32] coefficients. These bands may very well be divided into smaller band sections, each represented by a sub vector, by a PVQ band splitting-algorithm. For example, a band of size 8 may be split into smaller sub-sections, e.g. “4, 4” or “3, 3, 2”, if it is allocated enough bits. Typically, each band is split in such a way that a maximum of 32 bits may be used for shape coding of every final sub-vector.

In this PVQ-indexing implementation a band of size 8 may have up to 36 unit pulses, a sub section of size 7 may have up to 53 unit pulses, a section of size 6 may have up to 95 unit pulses, a section of size 5 may have up to 238 unit pulses, and a section of size 4, 3, 2 or 1 may have up to 512 unit pulses. As the shorter sections with a high number of pulses are created dynamically by band-splitting, they are more infrequent than the longer sub vector sizes. The WMOPS figures in the Result Tables below include PVQ-pre-search, PVQ-fine search, PVQ-normalization and PVQ-indexing. The “% identical” figures in the Result Tables below give the percentage of vectors found by the evaluated limited precision shape search algorithm that are identical to those found by an unconstrained floating point PVQ shape search.

Result Tables

TABLE 1 Results for final K <= 127

| Algorithm, En{energy-bits} × CorrSq{corrSq-bits} (pulses <= 127) | Min SNR (dB) | Seg-SNR (dB) | % identical vectors | Worst Case WMOPS | Average WMOPS | Remark |
|---|---|---|---|---|---|---|
| Mixed “En16 × CorrSq16”/“En32 × CorrSq32”, pre_analyze max(x_abs) | 4.771 | 188.803 | 99.3 | 6.843 | 5.496 | 16 × 16 always used, WC (worst case) in 16 × 16 |
| Locked “En16 × CorrSq16”, pre_analyze max(x_abs) | 4.771 | 188.803 | 99.3 | 6.843 | | No change as energy never exceeds 16 bits |
| “En16 × CorrSq16” using a known art correlation scaling method “OPUS”, using accumulated number of unit pulses | −6.021 | 180.556 | 94.6 | 6.826 | 5.476 | Algorithm is a bit worse (lower minSNR, fewer identical hits) at very similar complexity |
| Locked “En32 × CorrSq16”, pre-analyze max(x_abs) | 4.771 | 188.803 | 99.3 | 8.970 | 6.961 | Unnecessary to use En32 for pulses <= 127, as energy never exceeds 16 bits dynamics |
| Locked “En16 × CorrSq32”, pre-analyze input max(x_abs) | 190.0 | 190.0 | 100 | 9.386 | 7.248 | 2.5 WMOPS extra required for the last 0.7% hits |
| Locked “En32 × CorrSq32”, pre-analyze max(x_abs) | 190.0 | 190.0 | 100 | 10.474 | 7.999 | Unnecessary 0.9 WMOPS increase compared to Locked “En16 × CorrSq32” |

TABLE 2 Results for K > 127

| Algorithm, En{energy-bits} × CorrSq{corrSq-bits} (pulses > 127) | minSNR (dB) | segSNR (dB) | % identical vectors | Worst Case WMOPS | Average WMOPS | Remark |
|---|---|---|---|---|---|---|
| Mixed AccEn controlled “En16 × CorrSq16”/“En32 × CorrSq32”, pre_analyze input, acc. energy controlled precision | 32.686 | 160.316 | 80.4% | 6.843 (WC still from 16 × 16 sections) | 5.496 | A good enough solution; WC is still for 16 × 16, WC is not increased |
| Mixed AccEn controlled “En16 × CorrSq16”/“En16 × CorrSq32”, pre_analyze input, acc. energy controlled precision | 32.686 | 130.258 | 59.3% | n/a | n/a | Energy information is occasionally truncated, causing low SNR |
| Mixed AccEn controlled “En16 × CorrSq16”/“En32 × CorrSq16”, pre_analyze input, acc. energy controlled precision | 32.686 | 117.634 | 50.6% | n/a | n/a | Correlation information has low precision, causing low SNR |
| Locked “En16 × CorrSq16”, pre_analyze input | 32.686 | 113.629 | 47.8% | n/a | n/a | Energy information occasionally truncated and correlation information has low precision, causing low SNR |
| Locked “En32 × CorrSq16”, pre_analyze input | 32.686 | 117.634 | 50.6% | n/a | n/a | Correlation information has low precision, causing low SNR |
| Locked “En16 × CorrSq32”, pre_analyze input | 40.994 | 159.714 | 78.8% | n/a | n/a | Energy information is occasionally truncated, causing low SNR |
| Locked “En32 × CorrSq32”, pre_analyze input | 49.697 | 189.773 | 99.8% | 7.1 | 5.7 | WC now in 32 × 32 section, higher complexity WC |

1. A method for Pyramid Vector Quantizer (PVQ) shape search, performed by an audio encoder, the PVQ taking a target vector x as input and deriving a vector y by iteratively adding unit pulses in an inner dimension search loop, the method comprising: before entering a next inner dimension search loop for unit pulse addition, determining, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), related to an accumulated energy of y, in the next inner dimension search loop.
2. The method according to claim 1, wherein the method further comprises: before entering the next inner dimension search loop for unit pulse addition, determining, based on a maximum absolute value, xabs_(max), of the input vector, x, a possible upshift, in a bit word, of the next loop's accumulated in-loop correlation value, corr_(xy), between x and the vector y.
3. The method according to claim 1, further comprising: when more than the current bit word length is needed to represent enloop_(y), performing the inner loop calculations using a longer bit word length to represent enloop_(y).
4. The method according to claim 1, further comprising: when more than the current bit word length is needed to represent enloop_(y), performing the inner loop calculations using a longer bit word length to represent a squared accumulated in-loop correlation value, corr_(xy)², between x and the vector y, in the inner loop.
5. The method according to claim 1, further comprising: when more than the current bit word length is not needed to represent enloop_(y), performing the inner loop calculations by employing a first unit pulse addition loop using a first bit word length to represent enloop_(y); and when more than the current bit word length is needed to represent enloop_(y), performing the inner loop calculations by employing a second unit pulse addition loop using a longer bit word length to represent enloop_(y) than the first unit pulse addition loop.
6. The method according to claim 1, further comprising: when more than the current bit word length is not needed to represent enloop_(y), performing the inner loop calculations by employing a first unit pulse addition loop having a certain precision; and when more than the current bit word length is needed to represent enloop_(y), performing the inner loop calculations by employing a second unit pulse addition loop having a higher precision than the first unit pulse addition loop.
7. The method according to claim 1, wherein the determining, based on maxamp_(y), of whether more than the current bit word length is needed to represent enloop_(y) comprises determining characteristics of the case when, in the next inner dimension search loop, a unit pulse is added to the position in y being associated with maxamp_(y).
8. The method according to claim 1, further comprising: in the inner dimension search loop for unit pulse addition: determining a position, n_(best), in y for addition of a unit pulse by evaluating a cross-multiplication, for each position n in y, of a correlation and energy value for the current n; and a squared correlation, BestCorrSq and an energy value, bestEn, saved from previous values of n, as:

corr_(xy)² * bestEn > BestCorrSq * enloop_(y)

where

$\left. \begin{matrix} n_{best} = n \\ \mathrm{bestEn} = \mathrm{enloop}_{y} \\ \mathrm{BestCorrSq} = \mathrm{corr}_{xy}^{2} \end{matrix} \right\}, \quad \text{when } \mathrm{corr}_{xy}^{2} * \mathrm{bestEn} > \mathrm{BestCorrSq} * \mathrm{enloop}_{y}$
9. The method according to claim 1, further comprising: keeping track of maxamp_(y) when a final value of K, associated with the target vector x, exceeds a threshold value.
10. A computer program product comprising a non-transitory computer readable medium storing a computer program comprising instructions which, when executed on at least one processor, cause the at least one processor to carry out the method according to claim 1.
11. The computer program product according to claim 12, wherein at least one of the at least one processors is a Digital Signal Processor.
12. An audio encoder configured for Pyramid Vector Quantization (PVQ) shape search, the PVQ taking a target vector x as input and deriving a vector y by iteratively adding unit pulses in an inner dimension search loop, the audio encoder being configured to: before entering a next inner dimension search loop for unit pulse addition, determine, based on a maximum pulse amplitude, maxamp_(y), of a current vector y, whether more than a current bit word length is needed to represent, in a lossless manner, a variable, enloop_(y), related to an accumulated energy of y, in the next inner dimension loop.
13. The audio encoder according to claim 12, being further configured to: before entering the next inner dimension loop for unit pulse addition, determine, based on a maximum absolute value, xabs_(max), of the input vector, x, a possible upshift, in a bit word, of the next loop's accumulated in-loop correlation value, corr_(xy), between x and the vector y.
14. The audio encoder according to claim 12, being further configured to: perform the inner loop calculations using a longer bit word length to represent enloop_(y), when more than the current bit word length is needed to represent enloop_(y).
15. The audio encoder according to claim 12, being further configured to: perform the inner loop calculations by employing a first unit pulse addition loop using a first bit word length when more than the current bit word length is not needed to represent enloop_(y), and perform the inner loop calculations by employing a second unit pulse addition loop using a longer bit word length than the first unit pulse addition loop when more than the current bit word length is needed to represent enloop_(y).
16. The audio encoder according to claim 12, being further configured to: perform the inner loop calculations by employing a first unit pulse addition loop, having a certain precision, when more than the current bit word length is not needed to represent enloop_(y); and perform the inner loop calculations by employing a second unit pulse addition loop, having a higher precision than the first unit pulse addition loop, when more than the current bit word length is needed to represent enloop_(y).
17. The audio encoder according to claim 12, wherein the determining, based on maxamp_(y), of whether more than the current bit word length is needed to represent enloop_(y) is configured to comprise determining characteristics of the case when, in the next inner dimension search loop, a unit pulse is added to the position in y being associated with maxamp_(y).
18. The audio encoder according to claim 12, being further configured to: in the inner dimension search loop for unit pulse addition, determine a position, n_(best), in y for addition of a unit pulse by evaluating a cross-multiplication, for each position n in y, of a correlation and energy value for the current n; and a correlation, BestCorrSq, and energy value, bestEn, saved from previous values of n, as:

corr_(xy)² * bestEn > BestCorrSq * enloop_(y)

where

$\left. \begin{matrix} n_{best} = n \\ \mathrm{bestEn} = \mathrm{enloop}_{y} \\ \mathrm{BestCorrSq} = \mathrm{corr}_{xy}^{2} \end{matrix} \right\}, \quad \text{when } \mathrm{corr}_{xy}^{2} * \mathrm{bestEn} > \mathrm{BestCorrSq} * \mathrm{enloop}_{y}$
19. The audio encoder according to claim 12, being further configured to keep track of maxamp_(y) when a number of final unit pulses, K, associated with the target vector x, exceeds a threshold value.
20. A communication device comprising the audio encoder according to claim 12.
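
As an illustration only, the pre-loop determination based on maxamp_(y) and the cross-multiplication comparison recited above can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names are hypothetical, the input is assumed to be a vector of non-negative integers (absolute values of x), enloop_(y) is assumed to be the integer energy of y after a unit pulse is added (energy_y + 2*y[n] + 1), and the start values of BestCorrSq and bestEn are chosen so that the first position always wins. Python integers do not overflow, so the word length is only reported, not enforced.

```python
def needs_longer_word(energy_y, maxamp_y, word_bits=16):
    """Return True if the next loop's enloop_y could exceed a signed word.

    Assumption: the worst case occurs when the unit pulse is added at the
    position that already holds the maximum pulse amplitude maxamp_y.
    """
    worst_enloop = energy_y + 2 * maxamp_y + 1
    return worst_enloop > (1 << (word_bits - 1)) - 1


def add_unit_pulse(x_abs, y, corr_xy, energy_y):
    """One inner dimension search loop: add one unit pulse to y in place."""
    # Hypothetical start values: 0 > -enloop_y holds for the first n.
    n_best, best_corr_sq, best_en = 0, -1, 0
    for n, (x_n, y_n) in enumerate(zip(x_abs, y)):
        corr = corr_xy + x_n              # correlation if pulse goes to n
        corr_sq = corr * corr
        enloop_y = energy_y + 2 * y_n + 1  # energy if pulse goes to n
        # Cross-multiplication avoids dividing corr_sq by enloop_y.
        if corr_sq * best_en > best_corr_sq * enloop_y:
            n_best, best_corr_sq, best_en = n, corr_sq, enloop_y
    y[n_best] += 1
    return n_best, corr_xy + x_abs[n_best], best_en


def pvq_shape_search(x_abs, k):
    """Add k unit pulses; report the word length chosen before each loop."""
    y = [0] * len(x_abs)
    corr_xy = energy_y = maxamp_y = 0
    word_lengths = []
    for _ in range(k):
        longer = needs_longer_word(energy_y, maxamp_y)
        word_lengths.append(32 if longer else 16)
        n_best, corr_xy, energy_y = add_unit_pulse(x_abs, y, corr_xy, energy_y)
        maxamp_y = max(maxamp_y, y[n_best])  # keep track of maxamp_y
    return y, word_lengths


y, word_lengths = pvq_shape_search([3, 1, 2], 3)
```

In this sketch the check is made once per pulse, before the inner loop is entered, so the cost of the decision does not grow with the dimension of y; only the tracked maximum amplitude and the accumulated energy are needed.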